This repository has been archived by the owner on Jul 23, 2023. It is now read-only.

Shortening #-of Vector Lanes #16

Merged
merged 14 commits on Dec 29, 2021

@itzmeanjan (Owner) commented on Dec 27, 2021

Previously, I represented the Rescue Prime hash state as a single vector with 16 lanes, where each lane was a 64-bit unsigned integer (i.e. sycl::ulong16). Motivated by the discussion in rust-lang/portable-simd#215, I decided to rewrite the Rescue Prime hash routines so that the hash state is represented as sycl::ulong4[3], which is exactly 12 field elements wide. This no longer wastes space the way the 16-lane representation did, where 4 of the 16 lanes went unused.
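To make the layout change concrete, here is a minimal sketch of the two state representations. It is not the code from this PR: it uses plain C++ arrays as stand-ins for sycl::ulong16 and sycl::ulong4 so that it compiles without a SYCL toolchain, and the names wide_state, packed_state, unused_lanes and element are hypothetical, chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: plain C++ stand-ins for the SYCL vector types, used only so
// this sketch builds without a SYCL compiler.
using ulong4_t = std::array<std::uint64_t, 4>;   // models sycl::ulong4
using ulong16_t = std::array<std::uint64_t, 16>; // models sycl::ulong16

// Rescue Prime hash state width in field elements, as stated in the PR text.
constexpr std::size_t STATE_WIDTH = 12;

constexpr std::size_t WIDE_LANES = 16;      // lanes in the old layout
constexpr std::size_t PACKED_LANES = 3 * 4; // lanes in the new layout

// Old layout (hypothetical name): one 16-lane vector, of which only 12
// lanes carry hash state.
struct wide_state {
  ulong16_t lanes;
};

// New layout (hypothetical name): three 4-lane vectors, 3 x 4 = 12 lanes.
struct packed_state {
  ulong4_t rows[3];
};

// Number of lanes that hold no hash state for a given lane count.
constexpr std::size_t unused_lanes(std::size_t lanes) {
  return lanes - STATE_WIDTH;
}

// Access the i-th state element of the packed layout: row i / 4, lane i % 4.
inline std::uint64_t& element(packed_state& s, std::size_t i) {
  assert(i < STATE_WIDTH);
  return s.rows[i / 4][i % 4];
}
```

With these definitions, unused_lanes(WIDE_LANES) evaluates to 4 and unused_lanes(PACKED_LANES) to 0, which is the space saving described above.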

Benchmarks on an Nvidia Tesla V100 show that both the Rescue Prime hash and Merklization perform better after this change.

On the Intel CPU, the performance improvement from shortening the vector lanes is visible as well.

Rescue Prime Merge

| Platform | # of Work Items | Ops/sec |
|---|---|---|
| Intel CPU | 4096 x 4096 | 1.41329e+06 |
| Nvidia GPU | 4096 x 4096 | 2.02718e+07 |

Merklization using Rescue Prime Merge

| Platform | # of Work Items | Total Time |
|---|---|---|
| Intel CPU | 8388608 | 5974.54 ms |
| Nvidia GPU | 8388608 | 431.217 ms |

@itzmeanjan merged commit a3152cd into main on Dec 29, 2021
@itzmeanjan deleted the shorter-vector-lanes branch on December 29, 2021 05:34