We introduce the following optimizations, which can be traced through the commits and branches of this repository:
- Decrease the rejection rate in coefficient sampling
- Vectorize coefficient sampling using AVX2 and AVX512 instructions
- Replace SHAKE-128 for pseudorandomness generation with the faster SHA-256 or AES-256
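To illustrate the first item, here is a minimal scalar sketch of rejection sampling with a reduced rejection rate. It assumes the NewHope modulus q = 12289 and 16-bit candidates taken from a pseudorandom byte stream. The function name `sample_uniform`, the acceptance threshold 5q, and the buffer interface are illustrative assumptions, not taken from this repository's code. Accepting only 14-bit candidates below q rejects about 25% of them (1 - 12289/16384), whereas accepting 16-bit candidates below 5q = 61445 and reducing them modulo q rejects about 6.2% (1 - 61445/65536). The result stays uniform because every residue has exactly five preimages in [0, 5q).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PARAM_Q 12289u /* NewHope modulus (assumed here) */

/*
 * Hypothetical helper: fills r[0..n) with values uniform in [0, q),
 * consuming 16-bit little-endian candidates from buf.
 * Returns the number of coefficients written, which is less than n
 * if the buffer runs out before n candidates have been accepted.
 */
size_t sample_uniform(uint16_t *r, size_t n, const uint8_t *buf, size_t buflen)
{
    size_t ctr = 0; /* coefficients accepted so far */
    size_t pos = 0; /* read position in buf */

    assert(r != NULL && buf != NULL);

    while (ctr < n && pos + 2 <= buflen) {
        uint16_t val = (uint16_t)(buf[pos] | ((uint16_t)buf[pos + 1] << 8));
        pos += 2;

        /* Accept candidates below 5q; each residue mod q is hit 5 times. */
        if (val < 5 * PARAM_Q) {
            r[ctr++] = (uint16_t)(val % PARAM_Q);
        }
    }
    return ctr;
}
```

In a real implementation the caller would request more pseudorandom bytes whenever the return value is smaller than `n`. The vectorized variants apply the same accept-and-reduce step to several candidates at once.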
The goal is to demonstrate optimizations that leverage features of modern processor architectures. The following table presents our performance improvements, measured with the included testbench on an Intel(R) Core(TM) i7-4600U CPU @ 2.70 GHz. The optimizations are cumulative (each includes the previous ones), except for the last two, which are alternatives: pseudorandom bytes are generated with either SHA-256 or AES-256.
| Optimization | Server cycles (keygen+shareda) | Improvement | Client cycles (sharedb) | Improvement |
|---|---|---|---|---|
- Shay Gueron (1, 2)
- Fabian Schlieker (3)
(1) Intel Corporation, Israel Development Center, Haifa, Israel
(2) University of Haifa, Israel
(3) Ruhr University Bochum, Germany
This research was supported by the PQCRYPTO project, which was partially funded by the European Commission Horizon 2020 Research Programme (grant #645622), by the Israel Science Foundation (grant No. 1018/16), and by the Blavatnik Interdisciplinary Cyber Research Center (ICRC) at Tel Aviv University.
Modified work Copyright (c) 2016, Shay Gueron and Fabian Schlieker