Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

This patch changes the SHA512 code to avoid faults from mis-aligned accesses #90

Merged
merged 1 commit into from Sep 5, 2018

Conversation

Projects
None yet
4 participants
@radfordneal
Copy link
Contributor

radfordneal commented Aug 31, 2018

In particular, such faults arise with R-3.5.1 and digest-0.6.16 on a
SPARC T2 system running Solaris 11 when compiled with gcc 4.9.2 in
32-bit mode, as illustrated below:


R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: sparc-sun-solaris2.11 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

library(digest)
txt <- ""
for (i in 1:10) txt <- paste0(txt,"abcdefghijklmnopqrstuvwxyz")
r <- digest(txt,alg="sha512",serialize=FALSE,skip=0)
r <- digest(txt,alg="sha512",serialize=FALSE,skip=8)
r <- digest(txt,alg="sha512",serialize=FALSE,skip=4)

*** caught bus error ***
address 24cb07c, cause 'invalid alignment'

Traceback:
1: digest(txt, alg = "sha512", serialize = FALSE, skip = 4)
An irrecoverable exception occurred. R is aborting now ...
Bus Error (core dumped)


Whether the problem occurs varies with the compiler used, since some
SPARC instructions require alignment and others do not (so depends on
which instructions the compiler chooses). The problem presumably also
occurs on other platforms with alignment requirements. (Perhaps some
ARM systems?)

The fix is made to sha2.[ch], which is used for SHA512. There is an
analogous problem with the SHA256 code in those files, which is also
fixed (untested) in this patch, but it is in any case not enabled in
the 'digest' package (which uses other code for SHA256).

The issue is in the SHA512_Transform function, which is used locally
in sha2.c. It is declared as

void SHA512_Transform(SHA512_CTX* context, const sha2_word64* data)

Alignment faults may arise when it is called as

SHA512_Transform(context, (sha2_word64*)data);

since 'data' may not be 64-bit aligned. It is also called as

SHA512_Transform(context, (sha2_word64*)context->buffer);

which does not produce a fault because context-buffer happens to be
64-bit aligned. This call does, however, violate the 'const'
attribute in the prototype, since context->buffer is written to inside
SHA512_Transform. (This aliasing violates the C standard, but is
unlikely to actually cause a bug.)

This patch modifies SHA512_Transform so it is declared as

static void SHA512_Transform(SHA512_CTX* context)

with data always coming from context->buffer. The previous call with
'data' as the second argument is replaced by

MEMCPY_BCOPY(context->buffer, data, SHA256_BLOCK_LENGTH);
SHA256_Transform(context);

The copy should handled any alignment for 'data'.

There is no measurable performance impact.

This patch changes the SHA512 code to avoid faults from mis-aligned
accesses on platforms where that is an issue.

In particular, such faults arise with R-3.5.1 and digest-0.6.16 on a
SPARC T2 system running Solaris 11 when is compiled with gcc 4.9.2 in
32-bit mode, as illustrated below:

------------------------------------------------------------------------

R version 3.5.1 (2018-07-02) -- "Feather Spray"
Copyright (C) 2018 The R Foundation for Statistical Computing
Platform: sparc-sun-solaris2.11 (32-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> library(digest)
> txt <- ""
> for (i in 1:10) txt <- paste0(txt,"abcdefghijklmnopqrstuvwxyz")
> r <- digest(txt,alg="sha512",serialize=FALSE,skip=0)
> r <- digest(txt,alg="sha512",serialize=FALSE,skip=8)
> r <- digest(txt,alg="sha512",serialize=FALSE,skip=4)

 *** caught bus error ***
address 24cb07c, cause 'invalid alignment'

Traceback:
 1: digest(txt, alg = "sha512", serialize = FALSE, skip = 4)
An irrecoverable exception occurred. R is aborting now ...
Bus Error (core dumped)

------------------------------------------------------------------------

Whether the problem occurs varies with the compiler used, since some
SPARC instructions require alignment and others do not (so depends on
which instructions the compiler chooses).  The problem presumably also
occurs on other platforms with alignment requirements.  (Perhaps some
ARM systems?)

The fix is made to sha2.[ch], which is used for SHA512. There is an
analogous problem with the SHA256 code in those files, which is also
fixed (untested) in this patch, but it is in any case not enabled in
the 'digest' package (which uses other code for SHA256).

The issue is in the SHA512_Transform function, which is used locally
in sha2.c.  It is declared as

   void SHA512_Transform(SHA512_CTX* context, const sha2_word64* data)

Alignment faults may arise when it is called as

   SHA512_Transform(context, (sha2_word64*)data);

since 'data' may not be 64-bit aligned.  It is also called as

   SHA512_Transform(context, (sha2_word64*)context->buffer);

which does not produce a fault because context-buffer happens to be
64-bit aligned.  This call does, however, violate the 'const'
attribute in the prototype, since context->buffer is written to inside
SHA512_Transform.  (This aliasing violates the C standard, but is
unlikely to actually cause a bug.)

This patch modifies SHA512_Transform so it is declared as

   static void SHA512_Transform(SHA512_CTX* context)

with data always coming from context->buffer.  The previous call with
'data' as the second argument is replaced by

   MEMCPY_BCOPY(context->buffer, data, SHA256_BLOCK_LENGTH);
   SHA256_Transform(context);

The copy should handled any alignment for 'data'.

There is no measurable performance impact.
@codecov-io

This comment has been minimized.

Copy link

codecov-io commented Aug 31, 2018

Codecov Report

Merging #90 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@          Coverage Diff          @@
##           master    #90   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          16     16           
  Lines        1612   1613    +1     
=====================================
+ Hits         1612   1613    +1
Impacted Files Coverage Δ
src/sha2.c 100% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ec40051...cdff541. Read the comment docs.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 3, 2018

Thanks a lot, @radfordneal, and apologies that this fell to the side for a few days.

@hannesmuehleisen You were the one who added the sha* summaries, would you be able to take a look at this? I am mildly concerned that some of the details are altered, but then again this is of course offset by the same that Radford is a very known and trusted source as well as the fact that the existing tests all pass (no proof, of course, but a feel-good factor).

Otherwise, would either one of you know of another implementation that passes muster on all little-vs-big endian, 32-vs-64 bit OS, ... cases? I presume that that Sparc system Radford uses here is a little unusual as we have not heard from anyone else yet?

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 5, 2018

No word from @hannesmuehleisen , sadly, but I think this should be fine.

I'll try remember to release this to CRAN once CRAN reopens in a few days.

@eddelbuettel eddelbuettel merged commit ca8bf0c into eddelbuettel:master Sep 5, 2018

3 checks passed

codecov/patch 100% of diff hit (target 100%)
Details
codecov/project 100% (+0%) compared to ec40051
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@hannesmuehleisen

This comment has been minimized.

Copy link
Contributor

hannesmuehleisen commented Sep 10, 2018

Sorry only seeing this now. This is a great improvement, thanks, @radfordneal. As far as I'm concerned, as long as the tests and checks still pass, I'm happy.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 12, 2018

This is now on CRAN. As a side effect, it also took out an UBSAN report against the file.

I still have a few UBSAN issues left in xxhash.c (see the reports at https://cloud.r-project.org/web/checks/check_results_digest.html) all of which point at 4 (or 8) byte alignment.

If either of you, or @jimhester who contributed xxhash.c support, have spare cycles for this I would truly appreciate help with this.

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 13, 2018

I took another look and I may have to tell CRAN that two remaining UBSAN issues (now updated after our newest version got onto CRAN with the changes by @radfordneal):

  • all #ifdef for an alignment choice in xxhash.c lines 227 and 240
  • in pmurhash.c line 276 is it hidden away in an upstream macro that has not changed

are possibly false positives. Any comments?

@radfordneal

This comment has been minimized.

Copy link
Contributor Author

radfordneal commented Sep 13, 2018

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 13, 2018

Thanks a lot for suggesting this, Radford! I'll look into this this evening -- easy enough to swap the line and use the UBSAN-configured builds at RHub (though it may actually use the same Docker container I created and may need to update to whatever Brian Ripley uses these days...).

@eddelbuettel

This comment has been minimized.

Copy link
Owner

eddelbuettel commented Sep 13, 2018

From a first casual check -- of changing the one line after merging the @jimhester pr #91 -- it looks good!

Thanks again for the quick heads-up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.