Hi @sjaeckel, I'm not sure where to open a discussion: an issue or a pull request? I implemented SHA-1 accelerated with x86 intrinsics.
But for that I would need some support from the LibTomCrypt architecture:
- How to detect at compile-time which processor architecture we are compiling for? Is it x86, x64, or something else?
- How to detect which compiler is used? Is it GCC? Or clang? Or MSVC? Or something else that I do not know how to write intrinsics for?
- How to detect SHA-1 (and SSE2 and SSSE3 and SSE4.1) instruction set presence at run-time? Use the `cpuid` instruction and cache its result. But where? Is there a precedent already in the lib?
- How to detect the ability to use `alignas` or `pragma pack(n)` or something to force variable alignment to 16 bytes?
- How to add a new file to all the make, CMake, and other build systems? I have no idea how.
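
For illustration, the first four questions could be approached roughly as in the sketch below. Every `DEMO_*` / `demo_*` name is hypothetical and not part of LibTomCrypt. The sketch assumes the usual predefined macros (`__x86_64__`, `__i386__`, `_M_X64`, `_M_IX86`, `_MSC_VER`, `__GNUC__`), and a GCC or clang recent enough to ship `__get_cpuid_count` in `<cpuid.h>`. Note that `pragma pack(n)` can only lower alignment, so the sketch uses compiler alignment attributes instead. The cache is a plain static variable, which is not strictly thread-safe; a real implementation would need to decide how to handle that.

```c
#include <stdint.h>

/* All DEMO_* / demo_* names are hypothetical, not LibTomCrypt API. */

/* Target architecture: 32- or 64-bit x86, as reported by GCC/clang or MSVC. */
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#  define DEMO_ARCH_X86 1
#else
#  define DEMO_ARCH_X86 0
#endif

/* Compiler: test _MSC_VER first; clang also defines __GNUC__. */
#if defined(_MSC_VER)
#  include <intrin.h>
#  define DEMO_ALIGN16 __declspec(align(16))
#elif defined(__GNUC__)
#  if DEMO_ARCH_X86
#    include <cpuid.h>
#  endif
#  define DEMO_ALIGN16 __attribute__((aligned(16)))
#else
#  define DEMO_ALIGN16 /* unknown compiler: no alignment guarantee */
#endif

/* Example of a 16-byte aligned buffer, e.g. for one SHA-1 block. */
static DEMO_ALIGN16 unsigned char demo_block[64];

static int demo_is_aligned16(const void *p)
{
   return ((uintptr_t)p & 15u) == 0;
}

/* Query CPUID; all four registers read as zero if the leaf is unavailable. */
static void demo_cpuid(unsigned int leaf, unsigned int subleaf, unsigned int r[4])
{
#if DEMO_ARCH_X86 && defined(_MSC_VER)
   int tmp[4];
   r[0] = r[1] = r[2] = r[3] = 0;
   __cpuid(tmp, 0);                       /* tmp[0] = highest basic leaf */
   if ((unsigned int)tmp[0] >= leaf) {
      __cpuidex(tmp, (int)leaf, (int)subleaf);
      r[0] = (unsigned int)tmp[0]; r[1] = (unsigned int)tmp[1];
      r[2] = (unsigned int)tmp[2]; r[3] = (unsigned int)tmp[3];
   }
#elif DEMO_ARCH_X86 && defined(__GNUC__)
   r[0] = r[1] = r[2] = r[3] = 0;
   /* Returns 0 and leaves outputs untouched if the leaf is unsupported. */
   (void)__get_cpuid_count(leaf, subleaf, &r[0], &r[1], &r[2], &r[3]);
#else
   (void)leaf; (void)subleaf;
   r[0] = r[1] = r[2] = r[3] = 0;
#endif
}

/* Returns 1 if SSE2, SSSE3, SSE4.1 and the SHA extensions are all present. */
int demo_cpu_has_sha1(void)
{
   static int cached = -1;                /* -1 = not probed yet */
   unsigned int r[4];
   int ok;

   if (cached >= 0) return cached;
   demo_cpuid(1, 0, r);
   ok = ((r[3] >> 26) & 1u)               /* EDX bit 26: SSE2   */
     && ((r[2] >>  9) & 1u)               /* ECX bit  9: SSSE3  */
     && ((r[2] >> 19) & 1u);              /* ECX bit 19: SSE4.1 */
   demo_cpuid(7, 0, r);
   ok = ok && ((r[1] >> 29) & 1u);        /* leaf 7 EBX bit 29: SHA */
   cached = ok ? 1 : 0;
   return cached;
}
```

On a non-x86 target or an unrecognized compiler, `demo_cpu_has_sha1()` simply returns 0, so the portable code path would be used.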
My idea is to have two implementations of SHA-1 ("sha-1-portable" and "sha-1-x86") side by side in the lib (if the user wants that). Then have a third, generic "sha-1" one, which "just" dispatches to one of the other two, depending on the CPUID results.
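
The dispatch idea could look roughly like the sketch below. All `sketch_*` names are hypothetical, the two backends are placeholders that only report which one ran (they do no hashing), and `sketch_cpu_has_sha` is a stub standing in for a real cached CPUID probe.

```c
#include <stddef.h>

/* All sketch_* names are hypothetical, not LibTomCrypt API. */

enum { SKETCH_IMPL_PORTABLE = 0, SKETCH_IMPL_X86 = 1 };

typedef struct { unsigned long state[5]; } sketch_sha1_state;

/* Common signature shared by every backend. */
typedef int (*sketch_compress_fn)(sketch_sha1_state *st,
                                  const unsigned char *blocks,
                                  size_t nblocks);

/* Placeholder for the "sha-1-portable" backend: reports its identity only. */
static int sketch_compress_portable(sketch_sha1_state *st,
                                    const unsigned char *blocks,
                                    size_t nblocks)
{
   (void)st; (void)blocks; (void)nblocks;
   return SKETCH_IMPL_PORTABLE;
}

/* Placeholder for the "sha-1-x86" backend: reports its identity only. */
static int sketch_compress_x86(sketch_sha1_state *st,
                               const unsigned char *blocks,
                               size_t nblocks)
{
   (void)st; (void)blocks; (void)nblocks;
   return SKETCH_IMPL_X86;
}

/* Stub: a real build would call a cached CPUID-based probe here. */
static int sketch_cpu_has_sha(void)
{
   return 0;
}

/* Pick a backend from a feature flag; kept separate so it is easy to test. */
static sketch_compress_fn sketch_select(int cpu_has_sha)
{
   return cpu_has_sha ? sketch_compress_x86 : sketch_compress_portable;
}

/* Generic "sha-1" entry point: chooses once, then forwards every call. */
int sketch_sha1_compress(sketch_sha1_state *st,
                         const unsigned char *blocks,
                         size_t nblocks)
{
   static sketch_compress_fn chosen = NULL;

   if (chosen == NULL) {
      chosen = sketch_select(sketch_cpu_has_sha());
   }
   return chosen(st, blocks, nblocks);
}
```

Because the choice is stored in a function pointer, the CPUID check happens only on the first call. A build without the x86 backend could compile `sketch_select` to always return the portable function.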
My code is at: https://github.com/MarekKnapek/libtomcrypt/commits/SHA-1x86/.