AVX2 vectorization of Poseidon S-box #244

nbgl · 2021-09-14T15:18:08Z

Shaves 4% to 5% off hash times for width 8 and 9% to 10% for width 12.

The function needs to be interleaved to observe an improvement. My original attempt computed x^7 for one vector before moving on to the next; this produced dependency chains that were too long, and the loop had too many instructions for the CPU's out-of-order execution to overlap them. The compiler did not interleave the instructions itself, so I had to do it manually.
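To illustrate the interleaving idea, here is a minimal scalar sketch; it is not the code from this PR. It assumes the Goldilocks modulus 2^64 - 2^32 + 1 and uses a plain `u128` reduction instead of AVX2 intrinsics. The names `mul`, `sbox_sequential`, and `sbox_interleaved` are hypothetical. The first function finishes x^7 for one element before starting the next; the second issues the same step for every element before moving to the next step, so independent multiplications sit next to each other.

```rust
/// Field modulus, assumed here to be Goldilocks: 2^64 - 2^32 + 1.
const P: u64 = 0xFFFF_FFFF_0000_0001;

/// Modular multiplication through a 128-bit intermediate (simple, not optimized).
fn mul(a: u64, b: u64) -> u64 {
    ((a as u128 * b as u128) % P as u128) as u64
}

/// One element at a time: x^2, then x^3 and x^4, then x^7.
/// Each element's multiplications form a dependency chain of depth three.
fn sbox_sequential(state: &mut [u64; 4]) {
    for x in state.iter_mut() {
        let x2 = mul(*x, *x);
        let x3 = mul(x2, *x);
        let x4 = mul(x2, x2);
        *x = mul(x4, x3);
    }
}

/// Interleaved: perform the same step for all four elements before the next step,
/// so adjacent multiplications do not depend on each other.
fn sbox_interleaved(state: &mut [u64; 4]) {
    let [a, b, c, d] = *state;
    // Step 1: squares.
    let (a2, b2, c2, d2) = (mul(a, a), mul(b, b), mul(c, c), mul(d, d));
    // Step 2: cubes.
    let (a3, b3, c3, d3) = (mul(a2, a), mul(b2, b), mul(c2, c), mul(d2, d));
    // Step 3: fourth powers.
    let (a4, b4, c4, d4) = (mul(a2, a2), mul(b2, b2), mul(c2, c2), mul(d2, d2));
    // Step 4: x^7 = x^4 * x^3.
    *state = [mul(a4, a3), mul(b4, b3), mul(c4, c3), mul(d4, d3)];
}
```

Both functions compute the same values; only the source order of the multiplications differs. In scalar code like this the compiler is free to reorder the operations itself, so treat the sketch as a picture of the schedule rather than a guarantee of the emitted instruction order.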

dlubarov

👍

nbgl added 2 commits September 14, 2021 08:12

AVX2 vectorization of Poseidon S-box

f3d6315

Minor doc

d70c517

nbgl requested a review from dlubarov September 14, 2021 15:18

Microoptimization

fb19eaf

nbgl mentioned this pull request Sep 14, 2021

Tracking issue: Poseidon vectorization #242

Closed

dlubarov approved these changes Sep 15, 2021

nbgl merged commit b411a27 into main Sep 15, 2021

nbgl deleted the jakub/avx2-poseidon-sbox branch September 15, 2021 02:26
