I've ported the Linux kernel's SHA-1 implementation, the one that uses the native SHA instructions, to Go's flavor of assembly. On my Ryzen 5 1600 the result is a 1.5x to 3x speed-up. Could this be included in Go, similar to the AVX2 implementation that is already in crypto/sha1?