An implementation of the Skein hash function for ARMv7 with NEON


Skein for ARMv7 with NEON

This is an implementation of the Skein hash function, as described in The Skein Hash Function Family, version 1.3.

The implementation targets the ARM Cortex-A8 processor and uses NEON SIMD instructions to perform 64-bit operations, in parallel where possible.

There are also Skein-256 and Skein-512 implementations that do not require NEON support.

The code is based on the optimized C version written by Doug Whiting, with the block functions rewritten in ARM assembly language.
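As a rough illustration of the 64-bit operations involved, the portable C sketch below shows the MIX step at the core of Skein's Threefish block cipher (a 64-bit add, a rotate and an XOR), applied to two independent word pairs. This is not the assembly from this repository. The names rotl64, mix and mix_x2 are hypothetical, and the lane loop only stands in for the idea that a 128-bit NEON register holds two 64-bit values, so an operation such as the addition can be issued once for both pairs.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 64-bit word left by r bits (0 < r < 64). */
static uint64_t rotl64(uint64_t x, unsigned r)
{
    assert(r > 0 && r < 64);
    return (x << r) | (x >> (64 - r));
}

/* One MIX step on a pair of 64-bit words:
 *   x0 = x0 + x1 (mod 2^64)
 *   x1 = rotl(x1, r) XOR x0
 */
static void mix(uint64_t *x0, uint64_t *x1, unsigned r)
{
    *x0 += *x1;
    *x1 = rotl64(*x1, r) ^ *x0;
}

/* Two independent MIX steps, written lane by lane.
 * Hypothetical helper: a[i] and b[i] form one word pair, r[i] is its
 * rotation amount. The two lanes do not depend on each other, which is
 * what makes them candidates for one two-lane SIMD operation. */
static void mix_x2(uint64_t a[2], uint64_t b[2], const unsigned r[2])
{
    for (int lane = 0; lane < 2; lane++)
        mix(&a[lane], &b[lane], r[lane]);
}
```

Because the two lanes share no data within a MIX step, the order in which they are processed does not affect the result.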


For long messages, the implementation reaches the following speeds, in cycles per byte, when tested on a Cortex-A8 processor:

With NEON:

Skein-256    20.3
Skein-512    15.4
Skein-1024   20.2

Without NEON:

Skein-256    21.7
Skein-512    25.2

See performance_test.txt for more detailed test output.
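To relate the cycles-per-byte figures above to throughput, divide the clock frequency by the cost per byte. The sketch below does that arithmetic. The function name is hypothetical, and the clock frequency is a parameter you must supply yourself, since this README does not state the clock speed of the test machine.

```c
#include <assert.h>

/* Throughput in bytes per second for a hash that costs
 * cycles_per_byte on a core running at clock_hz.
 * Hypothetical helper, valid only for long messages where the
 * per-message overhead is negligible. */
static double bytes_per_second(double clock_hz, double cycles_per_byte)
{
    assert(clock_hz > 0.0 && cycles_per_byte > 0.0);
    return clock_hz / cycles_per_byte;
}
```

For example, at an assumed clock of 1 GHz, the 15.4 cycles per byte listed for Skein-512 would correspond to roughly 65 million bytes per second.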

Compiling and running the test program

The Skein test program can be compiled using GCC as follows:

gcc *.c skein_block_cortexa8.S -DSKEIN_USE_ASM=256+512+1024

Or without NEON:

gcc *.c skein_block_noneon.S -DSKEIN_USE_ASM=256+512

For the performance test to have an accurate timer, you need to enable user-mode access to the ARM performance monitor registers.

One way to do this is to use the kernel module included in the userperf folder.
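For orientation, the sketch below shows one way a user-mode program could read the ARMv7 cycle counter (PMCCNTR) once access has been enabled. It is an assumption that the test program uses this register, and the function names are hypothetical. If user-mode access is not enabled, the mrc instruction traps as an undefined instruction. On other targets the sketch falls back to the standard C clock() function, which counts processor time ticks rather than cycles.

```c
#include <stdint.h>
#include <time.h>

/* Read a 32-bit tick count.
 * On ARMv7-A this reads PMCCNTR (CP15 c9, c13, 0), which requires
 * user-mode access to the performance monitor to be enabled.
 * Elsewhere it falls back to clock(), which is NOT a cycle count. */
static uint32_t read_cycle_counter(void)
{
#if defined(__arm__) && defined(__ARM_ARCH_7A__)
    uint32_t cycles;
    __asm__ volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(cycles));
    return cycles;
#else
    return (uint32_t)clock();
#endif
}

/* Ticks elapsed between two readings. Unsigned subtraction gives the
 * right answer even if the 32-bit counter wrapped once in between. */
static uint32_t elapsed_cycles(uint32_t start, uint32_t end)
{
    return end - start;
}
```

A measurement would take one reading before and one after the code being timed and pass both to elapsed_cycles.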