Port cksum builtin performance improvements from illumos #391

Merged 1 commit into ksh93:master on Dec 21, 2021

JohnoKing

This commit ports performance optimizations from illumos for the libsum code (used by the cksum and sum builtins):
illumos@98bea71
The new codepath in libsum uses prefetching and loop unrolling to improve performance (prefetching is done with __builtin_prefetch() or sun_prefetch_read_many() if either is available).
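
For readers unfamiliar with the technique, here is a minimal sketch of what prefetching plus loop unrolling looks like for a plain byte sum. This is not the code from the illumos commit or from libsum: the names `byte_sum_simple`, `byte_sum_unrolled`, `PREFETCH_READ` and `PREFETCH_AHEAD`, the unroll factor of 8 and the 64-byte look-ahead are all assumptions made for illustration, and the compiler version checks discussed further down are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative names only; none of these come from libsum. */
#define PREFETCH_AHEAD 64	/* assumed look-ahead distance in bytes */

#if defined(__GNUC__)
/* rw = 0 (read), locality = 3 (keep in all cache levels) */
#define PREFETCH_READ(p)	__builtin_prefetch((p), 0, 3)
#else
/* No prefetch builtin available: the hint becomes a no-op. */
#define PREFETCH_READ(p)	((void)0)
#endif

/* Baseline: one byte added per loop iteration. */
static uint32_t byte_sum_simple(const unsigned char *b, size_t n)
{
	uint32_t c = 0;

	while (n--)
		c += *b++;
	return c;
}

/* Unrolled: eight bytes per iteration, with a read prefetch hint. */
static uint32_t byte_sum_unrolled(const unsigned char *b, size_t n)
{
	uint32_t c = 0;

	while (n >= 8) {
		/* Only hint at addresses that are still inside the buffer. */
		if (n > PREFETCH_AHEAD)
			PREFETCH_READ(b + PREFETCH_AHEAD);
		c += b[0];
		c += b[1];
		c += b[2];
		c += b[3];
		c += b[4];
		c += b[5];
		c += b[6];
		c += b[7];
		b += 8;
		n -= 8;
	}
	/* Handle the remaining 0 to 7 bytes one at a time. */
	while (n--)
		c += *b++;
	return c;
}
```

Both functions return the same value for any input; the unrolled variant only reduces loop overhead and asks the CPU to fetch upcoming data early, so whether it is faster depends on the compiler and hardware.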

Script for testing (note that cksum must be enabled in src/cmd/ksh93/data/builtins.c):

$ cat /tmp/foo
#!/bin/ksh
builtin cksum || exit 1
for ((i=0; i!=50000; i++)) do
    cksum -x att /etc/hosts
done >/dev/null

Results on Linux x86_64 (using CCFLAGS=-O2):

$ echo 'UNPATCHED:'; time arch/linux.i386-64/bin/ksh /tmp/foo; echo 'PATCHED:'; time /tmp/ksh /tmp/foo
UNPATCHED:

real	0m09.989s
user	0m07.582s
sys	0m02.406s
PATCHED:

real	0m06.536s
user	0m04.331s
sys	0m02.204s

src/lib/libsum/{sum-att.c,sum-crc.c,Mamfile}:
- Port the performance optimizations from illumos to 93u+m libsum. To prevent problems with older versions of GCC, avoid the new codepath if GCC is older than the 3.1 release series. Additionally, the ast.h header must be included to handle tcc defining __GNUC__ on FreeBSD.
- Apply some build fixes to allow the new codepath to build with Clang 3.6 and newer (my own testing indicates an even better performance improvement with Clang than with GCC).
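
As a rough illustration of the GCC version gate described above, a preprocessor check along these lines would select the new codepath only on GCC 3.1 or newer. This is a sketch under assumptions, not the check from the commit: `USE_FAST_SUM` and `sum_codepath` are hypothetical names, and testing `__TINYC__` directly is a stand-in for whatever ast.h does to deal with tcc defining `__GNUC__`. Clang also defines `__GNUC__` (reporting itself as GCC 4.2), so it passes this test.

```c
/*
 * USE_FAST_SUM is a hypothetical macro name, not one taken from libsum.
 * The __TINYC__ test is an assumption standing in for the ast.h handling.
 */
#if defined(__GNUC__) && !defined(__TINYC__) && \
	(__GNUC__ > 3 || (__GNUC__ == 3 && __GNUC_MINOR__ >= 1))
#define USE_FAST_SUM	1	/* GCC 3.1+ (or a compatible compiler) */
#else
#define USE_FAST_SUM	0	/* anything else keeps the old codepath */
#endif

/* Report which codepath this translation unit was compiled with. */
static const char *sum_codepath(void)
{
	return USE_FAST_SUM ? "optimized" : "portable";
}
```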

@JohnoKing JohnoKing force-pushed the cksum-performance-improvements branch from a015aaf to cedf849 Compare December 20, 2021 20:00
@McDutchie McDutchie merged commit ac0a3a5 into ksh93:master Dec 21, 2021
@JohnoKing JohnoKing deleted the cksum-performance-improvements branch December 21, 2021 08:03
McDutchie pushed a commit that referenced this pull request Dec 28, 2021