this prevents the RTS from being blocked for a long time when big chunks are used. it isn't necessary for init and finalize, which don't depend on any input.
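A minimal sketch of the slicing idea, written in C purely for illustration. In the library the slicing would sit on the caller's side, so that control returns to the runtime between calls. Here a C driver stands in for that caller. `toy_ctx`, `toy_update`, `toy_update_sliced` and the `MAX_SLICE` bound are all hypothetical names, not the library's API.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_SLICE 4096 /* hypothetical upper bound on bytes per update call */

/* Toy stand-in for a hash context; not a real hash. */
struct toy_ctx {
    uint64_t sz;    /* total bytes absorbed so far */
    uint32_t acc;   /* toy rolling state */
    unsigned calls; /* number of update calls, for illustration only */
};

static void toy_init(struct toy_ctx *c)
{
    c->sz = 0;
    c->acc = 0;
    c->calls = 0;
}

/* One potentially long-running call: work is proportional to len. */
static void toy_update(struct toy_ctx *c, const uint8_t *d, size_t len)
{
    size_t i;
    for (i = 0; i < len; i++)
        c->acc = c->acc * 31u + d[i];
    c->sz += len;
    c->calls++;
}

/* Caller-side driver: feed the input in slices of at most MAX_SLICE bytes,
 * so that no single update call runs for an unbounded time. */
static void toy_update_sliced(struct toy_ctx *c, const uint8_t *d, size_t len)
{
    while (len > 0) {
        size_t n = len < MAX_SLICE ? len : MAX_SLICE;
        toy_update(c, d, n);
        d += n;
        len -= n;
    }
}
```

Slicing changes only how many calls are made, not the resulting state. init and finalize do a fixed amount of work, so they need no such treatment.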
…o avoid being flagged as depending on hunit in the library.
all hashes here have a uint64 size, instead of the 2 uint32_t seen in the reference implementation. instead of appending the size in a 2-step process, optimise it into a single uint64 shift and swap.
instead of just fixing it, optimise the two-step process into a single uint64 shift and swap. sz is already represented as a uint64 anyway, rather than the usual 2 uint32_t.
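The shift and swap can be sketched as follows. This is an illustration under stated assumptions, not the library's code: it assumes the length field is the 64-bit big-endian bit count used by hashes such as SHA-1, and every helper name here (`swap64`, `store_be32`, `length_two_step`, `length_single`) is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Portable 64-bit byte swap (hypothetical helper, no compiler intrinsics). */
static uint64_t swap64(uint64_t v)
{
    v = ((v & 0x00ff00ff00ff00ffULL) << 8)  | ((v >> 8)  & 0x00ff00ff00ff00ffULL);
    v = ((v & 0x0000ffff0000ffffULL) << 16) | ((v >> 16) & 0x0000ffff0000ffffULL);
    return (v << 32) | (v >> 32);
}

/* Runtime endianness probe, so the sketch behaves the same on any host. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Store a 32-bit value in big-endian byte order. */
static void store_be32(uint8_t *out, uint32_t v)
{
    out[0] = (uint8_t)(v >> 24);
    out[1] = (uint8_t)(v >> 16);
    out[2] = (uint8_t)(v >> 8);
    out[3] = (uint8_t)v;
}

/* Two-step style: split the bit count into two 32-bit halves. */
static void length_two_step(uint64_t sz, uint8_t out[8])
{
    uint32_t hi = (uint32_t)(sz >> 29); /* upper 32 bits of sz * 8 */
    uint32_t lo = (uint32_t)(sz << 3);  /* lower 32 bits of sz * 8 */
    store_be32(out, hi);
    store_be32(out + 4, lo);
}

/* Single-step style: one uint64 shift (bytes to bits), one swap. */
static void length_single(uint64_t sz, uint8_t out[8])
{
    uint64_t bits = sz << 3;
    if (host_is_little_endian())
        bits = swap64(bits); /* convert to big-endian byte order */
    memcpy(out, &bits, 8);
}
```

Both functions write the same 8 bytes. The single-step form only becomes available once sz is held as one uint64 rather than two uint32_t.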
it was previously used to test manually unrolling the loop, which didn't yield any performance improvement.
This is mainly to facilitate future optimisations and to make it easier to rewrite do_chunk in asm. surprisingly, it also yields a small performance increase.
the C md2 implementation is a cross between rfc1319 and the skeleton used by every other cryptohash.
incremental interface is now slightly faster.
…64 aligned address.