Optimized implementation of BlaBla for SSE2/SSSE3/AVX2

This project is an optimized implementation of BlaBla for CPUs supporting SSE2, SSSE3 or AVX2 instructions. A reference C implementation is also provided for comparison. Another reference C implementation was written by Frank Denis.

The optimization strategy is inspired by the AVX2 ChaCha implementation by Samuel Neves.
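
As a rough illustration of why a ChaCha-style design suits SIMD, the sketch below compares a scalar double round with a row-wise one in portable C. It assumes that BlaBla applies the ChaCha structure to a 4x4 state of 64-bit words with BLAKE2b's rotation constants (32, 24, 16, 63), which this README does not spell out. All function names are hypothetical and are not taken from blabla-opt.c. The loop over four lanes stands in for a single vector instruction; the actual optimized code may be organized differently.

```c
#include <stdint.h>
#include <string.h>

/* Rotate a 64-bit word right by n bits (0 < n < 64). */
static uint64_t rotr64(uint64_t x, unsigned n) {
    return (x >> n) | (x << (64 - n));
}

/* Scalar quarter round. Rotations 32, 24, 16, 63 are BLAKE2b's
 * constants, assumed here to be the ones BlaBla uses. */
static void quarter_round(uint64_t *a, uint64_t *b, uint64_t *c, uint64_t *d) {
    *a += *b; *d = rotr64(*d ^ *a, 32);
    *c += *d; *b = rotr64(*b ^ *c, 24);
    *a += *b; *d = rotr64(*d ^ *a, 16);
    *c += *d; *b = rotr64(*b ^ *c, 63);
}

/* Scalar double round: four column rounds, then four diagonal rounds. */
static void double_round_scalar(uint64_t s[16]) {
    quarter_round(&s[0], &s[4], &s[8],  &s[12]);
    quarter_round(&s[1], &s[5], &s[9],  &s[13]);
    quarter_round(&s[2], &s[6], &s[10], &s[14]);
    quarter_round(&s[3], &s[7], &s[11], &s[15]);
    quarter_round(&s[0], &s[5], &s[10], &s[15]);
    quarter_round(&s[1], &s[6], &s[11], &s[12]);
    quarter_round(&s[2], &s[7], &s[8],  &s[13]);
    quarter_round(&s[3], &s[4], &s[9],  &s[14]);
}

/* Row-wise quarter round: the four lanes are independent, so a SIMD
 * version could process them with one vector operation per step. */
static void quarter_round_rows(uint64_t a[4], uint64_t b[4],
                               uint64_t c[4], uint64_t d[4]) {
    for (int i = 0; i < 4; i++)
        quarter_round(&a[i], &b[i], &c[i], &d[i]);
}

/* Rotate the four lanes of a row left by n positions. */
static void rotate_lanes(uint64_t r[4], int n) {
    uint64_t t[4];
    for (int i = 0; i < 4; i++)
        t[i] = r[(i + n) % 4];
    memcpy(r, t, sizeof t);
}

/* Row-wise double round: column round, rotate rows so that the
 * diagonals line up as columns, round again, then rotate back. */
static void double_round_rows(uint64_t s[16]) {
    uint64_t *a = s, *b = s + 4, *c = s + 8, *d = s + 12;
    quarter_round_rows(a, b, c, d);
    rotate_lanes(b, 1); rotate_lanes(c, 2); rotate_lanes(d, 3);
    quarter_round_rows(a, b, c, d);
    rotate_lanes(b, 3); rotate_lanes(c, 2); rotate_lanes(d, 1);
}

/* Returns 1 if both formulations produce the same state on an
 * arbitrary sample input after a few double rounds, 0 otherwise. */
int double_rounds_agree(void) {
    uint64_t x[16], y[16];
    for (int i = 0; i < 16; i++)
        x[i] = 0x0123456789abcdefULL * (uint64_t)(i + 1);
    memcpy(y, x, sizeof x);
    for (int r = 0; r < 4; r++) {
        double_round_scalar(x);
        double_round_rows(y);
    }
    return memcmp(x, y, sizeof x) == 0;
}
```

Under these assumptions, one row of four 64-bit words fills exactly one 256-bit AVX2 register, so each step of the row-wise quarter round would map to a single vector instruction, and the lane rotations would map to lane permutations.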

Benchmarks

The project still lacks extensive benchmarks across multiple architectures, but current tests suggest a ~15% performance improvement over the AVX2 ChaCha implementation for the same number of rounds.

Testing

You can check that the code compiles and benchmark the various implementations as follows.

make
./bench-ref
./bench-opt-sse2
./bench-opt-ssse3
./bench-opt-avx2

Authors

Guillaume Endignoux, while an intern at Kudelski Security