v1.0.0-beta.2

@FiloSottile tagged this 12 Dec 13:36
By allowing all limbs to be up to 52 bits between operations, which was
already allowed by all our code, we can make the carry propagation more
parallelizable. This seems to help the compiler-generated code more than
the handwritten assembly.
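
To illustrate what a parallelizable carry propagation can look like, here is a minimal sketch. It assumes a field element modulo 2^255-19 stored as five limbs in radix 2^51; the names `elem`, `mask51`, and `carryParallel` are hypothetical and are not taken from the library. Every carry is read from the input limbs, so the five shifts do not depend on each other, and the carry out of the top limb wraps to the bottom limb multiplied by 19.

```go
package main

import "fmt"

// mask51 selects the low 51 bits of a limb (assumed radix 2^51).
const mask51 = (1 << 51) - 1

// elem is a hypothetical field element: five limbs, least significant first.
type elem [5]uint64

// carryParallel performs a single carry pass. All carries are computed
// from the original limbs, so none waits on a previous limb's result.
// The output limbs may exceed 51 bits slightly; they are not fully reduced.
func carryParallel(v elem) elem {
	c0 := v[0] >> 51
	c1 := v[1] >> 51
	c2 := v[2] >> 51
	c3 := v[3] >> 51
	c4 := v[4] >> 51

	return elem{
		v[0]&mask51 + c4*19, // 2^255 = 19 mod 2^255-19, so the top carry wraps
		v[1]&mask51 + c0,
		v[2]&mask51 + c1,
		v[3]&mask51 + c2,
		v[4]&mask51 + c3,
	}
}

func main() {
	v := elem{1<<52 - 1, 0, 0, 0, 1 << 51}
	fmt.Println(carryParallel(v))
}
```

In this sketch each carry is at most 2^13-1 for any 64-bit input limb, so every output limb is below 2^51 + 19*2^13, which fits in 52 bits. A sequential carry chain would produce limbs under 51 bits but forces each step to wait for the one before it.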

name                    old time/op  new time/op  delta
Add-8                   7.77ns ±19%  6.43ns ± 1%  -17.16%  (p=0.000 n=10+8)
Mul-8                   26.3ns ± 0%  24.6ns ± 1%   -6.32%  (p=0.000 n=9+10)
Mul32-8                 5.86ns ± 1%  5.87ns ± 1%     ~     (p=0.171 n=10+10)
WideMultCall-8          2.54ns ± 0%  2.54ns ± 0%     ~     (p=0.965 n=9+8)
BasepointMul-8          18.6µs ± 1%  18.7µs ± 1%     ~     (p=0.095 n=9+10)
ScalarMul-8             65.6µs ± 3%  63.9µs ± 1%   -2.63%  (p=0.000 n=10+9)
VartimeDoubleBaseMul-8  61.1µs ± 1%  60.7µs ± 2%   -0.73%  (p=0.017 n=10+9)
MultiscalarMulSize8-8    224µs ± 1%   224µs ± 1%     ~     (p=0.182 n=10+9)