Add Power8 AES Encryption #497

Closed
noloader opened this issue Sep 14, 2017 · 1 comment

noloader commented Sep 14, 2017

This is a change control item. Power8 encryption using PPC built-ins was added in several steps:

The numbers for AES are shown below. GCC112 is a little-endian machine, while GCC119 is a big-endian machine.

A related issue was Improve under-aligned buffers for AltiVec and Power8. We did not clear all instances of under-aligned buffers. Rather, we cleaned up an alignment issue when doing so was inexpensive and profitable, that is, when it was not too much work and it improved performance.
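
As a rough illustration of the alignment concern, the portable sketch below tests whether a pointer sits on a 16-byte boundary and copies from a possibly under-aligned buffer into an aligned block. The names `is_aligned_on`, `Block16`, and `load_block` are hypothetical and are not taken from the library's AltiVec or Power8 code, which would use vector loads rather than `std::memcpy`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: true when p sits on an `alignment`-byte boundary.
inline bool is_aligned_on(const void* p, std::size_t alignment) {
    assert(alignment != 0);
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// Hypothetical 16-byte block standing in for a vector register.
struct alignas(16) Block16 {
    unsigned char bytes[16];
};

// Copy 16 bytes from a possibly under-aligned source into an aligned block.
// memcpy places no alignment requirement on src.
inline Block16 load_block(const unsigned char* src) {
    Block16 b;
    std::memcpy(b.bytes, src, sizeof(b.bytes));
    return b;
}
```

A caller could take a faster aligned path when `is_aligned_on(ptr, 16)` holds and fall back to `load_block` otherwise.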

noloader referenced this issue Sep 14, 2017
This is the forward direction on encryption only. Crypto++ uses the "Equivalent Inverse Cipher" (FIPS-197, Section 5.3.5, p.23), and it is not compatible with IBM hardware. The library will need to rework the decryption key scheduling routines. (We may be able to work around it another way, but I have not investigated it.)
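
For background, FIPS-197 derives the Equivalent Inverse Cipher's key schedule by applying InvMixColumns to the round keys for rounds 1 through Nr-1. The sketch below shows that transform in portable C++. It is an illustration of the standard only, not the library's key scheduling code; `gf_mul`, `inv_mix_column`, and `to_equivalent_inverse_schedule` are hypothetical names, and treating each schedule word as a 4-byte column is an assumption of the sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Multiply two elements of GF(2^8) modulo x^8 + x^4 + x^3 + x + 1.
inline std::uint8_t gf_mul(std::uint8_t a, std::uint8_t b) {
    std::uint8_t product = 0;
    while (b != 0) {
        if (b & 1) product ^= a;
        const bool carry = (a & 0x80) != 0;
        a = static_cast<std::uint8_t>(a << 1);
        if (carry) a ^= 0x1b;
        b >>= 1;
    }
    return product;
}

using Column = std::array<std::uint8_t, 4>;

// InvMixColumns on one 4-byte column (FIPS-197, Section 5.3.3).
inline Column inv_mix_column(const Column& c) {
    return Column{{
        static_cast<std::uint8_t>(gf_mul(c[0], 14) ^ gf_mul(c[1], 11) ^ gf_mul(c[2], 13) ^ gf_mul(c[3], 9)),
        static_cast<std::uint8_t>(gf_mul(c[0], 9) ^ gf_mul(c[1], 14) ^ gf_mul(c[2], 11) ^ gf_mul(c[3], 13)),
        static_cast<std::uint8_t>(gf_mul(c[0], 13) ^ gf_mul(c[1], 9) ^ gf_mul(c[2], 14) ^ gf_mul(c[3], 11)),
        static_cast<std::uint8_t>(gf_mul(c[0], 11) ^ gf_mul(c[1], 13) ^ gf_mul(c[2], 9) ^ gf_mul(c[3], 14))}};
}

// Hypothetical helper: transform the words of round keys 1..rounds-1 in place.
// The first and last round keys (4 words each) are left untouched.
inline void to_equivalent_inverse_schedule(std::vector<Column>& words, std::size_t rounds) {
    assert(words.size() >= 4 * (rounds + 1));
    for (std::size_t i = 4; i < 4 * rounds; ++i)
        words[i] = inv_mix_column(words[i]);
}
```

For AES-128 the schedule holds 44 words and `rounds` is 10, so words 4 through 39 are transformed.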
noloader referenced this issue Sep 14, 2017
This increases performance to about 1.6 cpb (cycles per byte). We are about 0.5 cpb behind Botan, and about 1.0 cpb behind OpenSSL. However, it beats the snot out of C/C++, which runs at 20 to 30 cpb.
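
The cycles-per-byte figure can be related to throughput with a little arithmetic. The sketch below assumes cpb is simply clock frequency divided by bytes per second, which is my reading of the numbers rather than something the benchmark program is confirmed to do; `cycles_per_byte` is a hypothetical name.

```cpp
#include <cassert>

// Cycles per byte from a clock frequency in GHz and a throughput in MiB/s.
// Assumes 1 MiB = 1024 * 1024 bytes.
inline double cycles_per_byte(double clock_ghz, double mib_per_second) {
    assert(mib_per_second > 0.0);
    const double bytes_per_second = mib_per_second * 1024.0 * 1024.0;
    return clock_ghz * 1e9 / bytes_per_second;
}
```

With the 3.2 GHz clock listed for GCC112 later in this issue, `cycles_per_byte(3.2, 1950.0)` comes out at about 1.6, in line with the figure quoted above.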
noloader referenced this issue Sep 14, 2017
Performance increased for all modes when processing 6 blocks at a time (6x) versus 4 blocks (4x). 8x and 12x performed worse.

Here are the numbers:
4x Blocks:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 1563 | 2.1 | 0.409 | 1392 |
| AES/CTR (192-bit key) | 1403 | 2.3 | 0.450 | 1529 |
| AES/CTR (256-bit key) | 1280 | 2.5 | 0.482 | 1639 |
| AES/CBC (128-bit key) | 582 | 5.6 | 0.359 | 1222 |
| AES/CBC (192-bit key) | 517 | 6.3 | 0.394 | 1339 |
| AES/CBC (256-bit key) | 474 | 6.8 | 0.432 | 1469 |
| AES/OFB (128-bit key) | 533 | 6.1 | 0.402 | 1368 |
| AES/CFB (128-bit key) | 563 | 5.8 | 0.461 | 1568 |
| AES/ECB (128-bit key) | 1829 | 1.8 | 0.240 | 817 |

6x Blocks:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 1750 | 1.7 | 0.406 | 1300 |
| AES/CTR (192-bit key) | 1638 | 1.9 | 0.447 | 1432 |
| AES/CTR (256-bit key) | 1528 | 2.0 | 0.482 | 1541 |
| AES/CBC (128-bit key) | 582 | 5.2 | 0.358 | 1145 |
| AES/CBC (192-bit key) | 517 | 5.9 | 0.394 | 1260 |
| AES/CBC (256-bit key) | 474 | 6.4 | 0.431 | 1379 |
| AES/OFB (128-bit key) | 533 | 5.7 | 0.400 | 1281 |
| AES/CFB (128-bit key) | 563 | 5.4 | 0.461 | 1476 |
| AES/ECB (128-bit key) | 1950 | 1.6 | 0.238 | 763 |
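
The idea behind the 4x and 6x variants can be sketched in portable C++: several independent blocks advance through each round together, so the hardware has independent work to overlap. This is only an illustration under stated assumptions. `toy_round` is a made-up stand-in and is not AES, and `encrypt_one` and `encrypt_interleaved` are hypothetical names; the actual code uses Power8 built-ins.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::array<std::uint8_t, 16>;

// Stand-in for one cipher round. This is NOT AES; it only gives the
// sketch something deterministic to compute.
inline void toy_round(Block& state, const Block& round_key) {
    for (std::size_t i = 0; i < state.size(); ++i)
        state[i] = static_cast<std::uint8_t>((state[i] ^ round_key[i]) * 5 + 1);
}

// Reference path: run one block through every round.
inline void encrypt_one(Block& block, const std::vector<Block>& keys) {
    assert(!keys.empty());
    for (const Block& k : keys) toy_round(block, k);
}

// Interleaved path: Width independent blocks step through each round
// together; leftover blocks fall back to the one-at-a-time path.
template <std::size_t Width>
void encrypt_interleaved(std::vector<Block>& blocks, const std::vector<Block>& keys) {
    static_assert(Width > 0, "Width must be positive");
    std::size_t i = 0;
    for (; i + Width <= blocks.size(); i += Width)
        for (const Block& k : keys)
            for (std::size_t j = 0; j < Width; ++j)
                toy_round(blocks[i + j], k);
    for (; i < blocks.size(); ++i)
        encrypt_one(blocks[i], keys);
}
```

Only modes whose blocks are independent, such as ECB and CTR, can be grouped this way. Chained encryption such as CBC must finish one block before starting the next.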
noloader referenced this issue Sep 18, 2017
…ian machines

The refactoring has no effect on little-endian machines. However, on big-endian GCC119 using GCC 7.1 the throughput improved by roughly 1.8x to 2.4x for ECB and CTR modes:

BEFORE:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 2723 | 1.4 | 0.163 | 670 |
| AES/CTR (192-bit key) | 2560 | 1.5 | 0.175 | 719 |
| AES/CTR (256-bit key) | 2728 | 1.4 | 0.183 | 749 |
| AES/CBC (128-bit key) | 1204 | 3.2 | 0.135 | 554 |
| AES/CBC (192-bit key) | 1066 | 3.7 | 0.148 | 605 |
| AES/CBC (256-bit key) | 948 | 4.1 | 0.155 | 635 |
| AES/OFB (128-bit key) | 1019 | 3.8 | 0.158 | 648 |
| AES/CFB (128-bit key) | 949 | 4.1 | 0.192 | 787 |
| AES/ECB (128-bit key) | 3564 | 1.1 | 0.082 | 337 |

AFTER:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 6484 | 0.6 | 0.163 | 677 |
| AES/CTR (192-bit key) | 5641 | 0.7 | 0.176 | 728 |
| AES/CTR (256-bit key) | 5005 | 0.8 | 0.183 | 761 |
| AES/CBC (128-bit key) | 1223 | 3.2 | 0.135 | 559 |
| AES/CBC (192-bit key) | 1080 | 3.7 | 0.147 | 611 |
| AES/CBC (256-bit key) | 966 | 4.1 | 0.155 | 642 |
| AES/OFB (128-bit key) | 1057 | 3.7 | 0.158 | 656 |
| AES/CFB (128-bit key) | 1217 | 3.3 | 0.186 | 774 |
| AES/ECB (128-bit key) | 7289 | 0.5 | 0.082 | 342 |
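
The size of the improvement can be read off the BEFORE and AFTER numbers by dividing one throughput by the other. A small helper, with the hypothetical name `speedup`, makes the arithmetic explicit:

```cpp
#include <cassert>

// Ratio of AFTER throughput to BEFORE throughput, both in MiB/s.
inline double speedup(double before_mib_s, double after_mib_s) {
    assert(before_mib_s > 0.0);
    return after_mib_s / before_mib_s;
}
```

For example, `speedup(2723.0, 6484.0)` for AES/CTR with a 128-bit key is about 2.4, and `speedup(3564.0, 7289.0)` for AES/ECB is about 2.0.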

noloader commented Sep 18, 2017

Here are the final numbers using Power8 in-core crypto acceleration for AES.

GCC112 (Linux, little endian, 3.2 GHz), GCC 4.8:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 3012 | 1.0 | 0.408 | 1305 |
| AES/CTR (192-bit key) | 2695 | 1.1 | 0.447 | 1431 |
| AES/CTR (256-bit key) | 2381 | 1.3 | 0.482 | 1541 |
| AES/CBC (128-bit key) | 602 | 5.1 | 0.358 | 1145 |
| AES/CBC (192-bit key) | 528 | 5.8 | 0.397 | 1270 |
| AES/CBC (256-bit key) | 474 | 6.4 | 0.433 | 1386 |
| AES/OFB (128-bit key) | 551 | 5.5 | 0.407 | 1303 |
| AES/CFB (128-bit key) | 575 | 5.3 | 0.453 | 1449 |
| AES/ECB (128-bit key) | 3758 | 0.8 | 0.239 | 766 |

GCC112 (Linux, little endian, 3.15 GHz), XL/C 13.01:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 721 | 4.2 | 0.442 | 1415 |
| AES/CTR (192-bit key) | 624 | 4.9 | 0.485 | 1553 |
| AES/CTR (256-bit key) | 551 | 5.5 | 0.520 | 1664 |
| AES/CBC (128-bit key) | 291 | 10.5 | 0.379 | 1211 |
| AES/CBC (192-bit key) | 267 | 11.4 | 0.416 | 1332 |
| AES/CBC (256-bit key) | 249 | 12.3 | 0.451 | 1442 |
| AES/OFB (128-bit key) | 288 | 10.6 | 0.436 | 1394 |
| AES/CFB (128-bit key) | 291 | 10.5 | 0.513 | 1641 |
| AES/ECB (128-bit key) | 764 | 4.0 | 0.269 | 860 |

GCC119 (AIX, big endian, 4.15 GHz), GCC 6.1:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 6487 | 0.6 | 0.163 | 677 |
| AES/CTR (192-bit key) | 5658 | 0.7 | 0.176 | 729 |
| AES/CTR (256-bit key) | 4998 | 0.8 | 0.183 | 761 |
| AES/CBC (128-bit key) | 1223 | 3.2 | 0.135 | 559 |
| AES/CBC (192-bit key) | 1080 | 3.7 | 0.148 | 612 |
| AES/CBC (256-bit key) | 967 | 4.1 | 0.155 | 642 |
| AES/OFB (128-bit key) | 1057 | 3.7 | 0.158 | 656 |
| AES/CFB (128-bit key) | 1219 | 3.2 | 0.187 | 777 |
| AES/ECB (128-bit key) | 7287 | 0.5 | 0.082 | 342 |

GCC119 (AIX, big endian, 4.15 GHz), XL/C 13.01:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 1538 | 2.6 | 0.218 | 903 |
| AES/CTR (192-bit key) | 1324 | 3.0 | 0.233 | 965 |
| AES/CTR (256-bit key) | 1159 | 3.4 | 0.235 | 975 |
| AES/CBC (128-bit key) | 802 | 4.9 | 0.184 | 765 |
| AES/CBC (192-bit key) | 727 | 5.4 | 0.195 | 811 |
| AES/CBC (256-bit key) | 675 | 5.9 | 0.201 | 836 |
| AES/OFB (128-bit key) | 747 | 5.3 | 0.209 | 869 |
| AES/CFB (128-bit key) | 783 | 5.1 | 0.245 | 1017 |
| AES/ECB (128-bit key) | 1640 | 2.4 | 0.094 | 388 |

Here are the original numbers without acceleration.

GCC112, GCC 4.8, no intrinsics:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 121 | 25.3 | 0.334 | 1067 |
| AES/CTR (192-bit key) | 107 | 28.6 | 0.386 | 1235 |
| AES/CTR (256-bit key) | 76 | 40.1 | 0.382 | 1221 |
| AES/CBC (128-bit key) | 94 | 32.4 | 0.284 | 908 |
| AES/CBC (192-bit key) | 102 | 29.9 | 0.313 | 1000 |
| AES/CBC (256-bit key) | 93 | 32.9 | 0.333 | 1066 |
| AES/OFB (128-bit key) | 120 | 25.5 | 0.330 | 1058 |
| AES/CFB (128-bit key) | 121 | 25.3 | 0.472 | 1510 |
| AES/ECB (128-bit key) | 122 | 24.9 | 0.170 | 544 |

GCC112, XL/C 13.01, no intrinsics:

| Algorithm | MiB/Second | Cycles Per Byte | Key setup (µs) | Key setup (cycles) |
|---|---|---|---|---|
| AES/CTR (128-bit key) | 141 | 21.6 | 0.367 | 1175 |
| AES/CTR (192-bit key) | 125 | 24.4 | 0.405 | 1295 |
| AES/CTR (256-bit key) | 111 | 27.4 | 0.421 | 1349 |
| AES/CBC (128-bit key) | 131 | 23.2 | 0.303 | 969 |
| AES/CBC (192-bit key) | 115 | 26.5 | 0.332 | 1061 |
| AES/CBC (256-bit key) | 104 | 29.3 | 0.353 | 1130 |
| AES/OFB (128-bit key) | 141 | 21.6 | 0.365 | 1167 |
| AES/CFB (128-bit key) | 144 | 21.2 | 0.479 | 1534 |
| AES/ECB (128-bit key) | 144 | 21.2 | 0.194 | 619 |

noloader added the Enhancement, gcc-7, XLC, gcc-4, and PowerPC labels and removed the gcc-7 label on Sep 19, 2017