Port `cksum` builtin performance improvements from illumos #391
This commit ports performance optimizations from illumos for the libsum code (used by the `cksum` and `sum` builtins): illumos@98bea71

The new codepath in libsum uses prefetching and loop unrolling to improve performance (prefetching is done with `__builtin_prefetch()` or `sun_prefetch_read_many()` if either is available).

Script for testing (note that `cksum` must be enabled in `src/cmd/ksh93/data/builtins.c`):

Results on Linux x86_64 (using `CCFLAGS=-O2`):

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To prevent problems with older versions of GCC, avoid the new codepath if GCC is older than the 3.1 release series. Additionally, the `ast.h` header must be included to handle tcc defining `__GNUC__` on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang 3.6 and newer (my own testing indicates an even better performance improvement with Clang than with GCC).
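
For readers unfamiliar with the technique, here is a minimal sketch of a prefetching, unrolled checksum loop behind a compiler version guard. It is not the libsum code: the names `SKETCH_PREFETCH`, `SKETCH_AHEAD`, and `sketch_byte_sum()` are hypothetical, the 64-byte prefetch distance and 8x unroll factor are assumptions chosen for illustration, and a plain byte sum stands in for the real checksum algorithms.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: use __builtin_prefetch() only when the compiler
 * reports GCC 3.1 or newer; otherwise the macro expands to a no-op. */
#if defined(__GNUC__) && (__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
#define SKETCH_PREFETCH(p) __builtin_prefetch((p), 0, 0) /* read, low temporal locality */
#else
#define SKETCH_PREFETCH(p) ((void)0)
#endif

/* Assumed prefetch distance in bytes (illustrative, not tuned). */
#define SKETCH_AHEAD 64

/* Sum all bytes of buf, processing 8 bytes per iteration where possible. */
uint32_t sketch_byte_sum(const unsigned char *buf, size_t n)
{
    const unsigned char *p = buf;
    const unsigned char *e = buf + n;
    uint32_t s = 0;

    /* Unrolled main loop: 8 bytes per pass. */
    while ((size_t)(e - p) >= 8) {
        /* Only form the prefetch address while it stays inside the buffer. */
        if ((size_t)(e - p) > SKETCH_AHEAD)
            SKETCH_PREFETCH(p + SKETCH_AHEAD);
        s += p[0]; s += p[1]; s += p[2]; s += p[3];
        s += p[4]; s += p[5]; s += p[6]; s += p[7];
        p += 8;
    }

    /* Tail loop: remaining 0 to 7 bytes. */
    while (p < e)
        s += *p++;

    return s;
}
```

Calling `sketch_byte_sum()` on a buffer gives the same result whether or not the prefetch macro is active, since prefetching is only a hint to the hardware; the fallback branch of the guard mirrors the idea of avoiding the new codepath on compilers that lack the builtin.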