Skip to content
This repository has been archived by the owner on Jul 1, 2022. It is now read-only.

element.Mul now uses ADOX ADCX and MULX when available #13

Merged
merged 13 commits into from
Apr 8, 2020
Merged

Conversation

gbotrel
Copy link
Collaborator

@gbotrel gbotrel commented Apr 7, 2020

Preliminary benchmarks:
for a 4 word modulus (bn256):

bn256 [develop] $ benchcmp old.txt new.txt 
benchmark                       old ns/op     new ns/op     delta
BenchmarkMulAssignELEMENT-8     28.1          21.8          -22.42%
BenchmarkMulAssignELEMENT-8     28.8          21.8          -24.31%
BenchmarkMulAssignELEMENT-8     28.9          21.8          -24.57%

for a 6 word modulus (bls377, bls381):
cycles count: 121 (vs 125 for herumi/mcl)

benchmark                       old ns/op     new ns/op     delta
BenchmarkMulAssignELEMENT-8     53.9          40.7          -24.49%
BenchmarkMulAssignELEMENT-8     54.0          40.9          -24.26%
BenchmarkMulAssignELEMENT-8     54.0          40.3          -25.37%

internal/template/asm/mul.go contains code generation for the amd64 multiplication. It checks for availability of ADDX and MULX and jump to the standard version if not present.

To benefit from the ADCX and ADOX carry chains we split the inner loops in 2 (https://hackmd.io/@zkteam/modular_multiplication) :

    // for i=0 to N-1
    // 		for j=0 to N-1
    // 		    (A,t[j])  := t[j] + a[j]*b[i] + A
    // 		m := t[0]*q'[0] mod W
    // 		C,_ := t[0] + m*q[0]
    // 		for j=1 to N-1
    // 		    (C,t[j-1]) := t[j] + m*q[j] + C
    // 		t[N-1] = C + A

@gbotrel gbotrel mentioned this pull request Apr 7, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant