Optimize ScalarMult using endomorphism #8

jimmysong · 2014-09-25T22:12:35Z

Optimize ScalarMult using endomorphism

This implements a speedup to ScalarMult using the endomorphism available to secp256k1.

Note the constants lambda, beta, a1, b1, a2 and b2 are from here:

https://bitcointalk.org/index.php?topic=3238.0

Preliminary tests indicate a speedup of between 23%-28% (BenchScalarMult).

More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.

Note the NAF optimization was specifically not done as that's the purview of another issue.

This closes #1
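
For readers new to the trick, here is a minimal sketch of the endomorphism itself, assuming the standard secp256k1 parameters and the lambda and beta values quoted later in this thread. It is not the PR's code: the `point` type and the `add`, `scalarMult` and `endo` helpers are hypothetical, and it uses slow affine arithmetic on `math/big` purely to show that multiplying a point by lambda gives the same result as multiplying its x coordinate by beta.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// mustHex parses a hex constant (spaces allowed) and panics on a typo.
func mustHex(s string) *big.Int {
	v, ok := new(big.Int).SetString(strings.ReplaceAll(s, " ", ""), 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

// secp256k1 field prime and generator, plus the lambda and beta from this thread.
var (
	fieldP = mustHex("ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff fffffffe fffffc2f")
	gx     = mustHex("79be667e f9dcbbac 55a06295 ce870b07 029bfcdb 2dce28d9 59f2815b 16f81798")
	gy     = mustHex("483ada77 26a3c465 5da4fbfc 0e1108a8 fd17b448 a6855419 9c47d08f fb10d4b8")
	lambda = mustHex("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	beta   = mustHex("7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee")
)

// point is an affine point; a nil *point stands for the point at infinity.
type point struct{ x, y *big.Int }

// add returns a+b on y^2 = x^3 + 7 using the textbook affine formulas.
func add(a, b *point) *point {
	if a == nil {
		return b
	}
	if b == nil {
		return a
	}
	var num, den *big.Int
	if a.x.Cmp(b.x) == 0 {
		// Same x: either b == -a (result is infinity) or b == a (doubling).
		if a.y.Cmp(b.y) != 0 || a.y.Sign() == 0 {
			return nil
		}
		num = new(big.Int).Mul(a.x, a.x)
		num.Mul(num, big.NewInt(3))
		den = new(big.Int).Lsh(a.y, 1)
	} else {
		num = new(big.Int).Sub(b.y, a.y)
		den = new(big.Int).Sub(b.x, a.x)
	}
	den.Mod(den, fieldP)
	s := new(big.Int).ModInverse(den, fieldP)
	s.Mul(s, num).Mod(s, fieldP)
	x := new(big.Int).Mul(s, s)
	x.Sub(x, a.x).Sub(x, b.x).Mod(x, fieldP)
	y := new(big.Int).Sub(a.x, x)
	y.Mul(y, s).Sub(y, a.y).Mod(y, fieldP)
	return &point{x, y}
}

// scalarMult is plain left-to-right double-and-add, with no endomorphism.
func scalarMult(k *big.Int, pt *point) *point {
	var acc *point
	for i := k.BitLen() - 1; i >= 0; i-- {
		acc = add(acc, acc)
		if k.Bit(i) == 1 {
			acc = add(acc, pt)
		}
	}
	return acc
}

// endo maps (x, y) to (beta*x mod P, y), which costs one field multiplication.
func endo(pt *point) *point {
	x := new(big.Int).Mul(beta, pt.x)
	return &point{x.Mod(x, fieldP), new(big.Int).Set(pt.y)}
}

// equal compares two finite points.
func equal(a, b *point) bool {
	return a != nil && b != nil && a.x.Cmp(b.x) == 0 && a.y.Cmp(b.y) == 0
}

func main() {
	g := &point{gx, gy}
	fmt.Println("lambda*G == endo(G):", equal(scalarMult(lambda, g), endo(g)))
}
```

The speedup comes from splitting k into two roughly half-length scalars k1 and k2 with k = k1 + k2*lambda (mod N), so that k*P = k1*P + k2*endo(P) needs about half as many doublings.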

coveralls · 2014-09-25T22:17:24Z

Coverage decreased (-0.57%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-25T22:19:45Z

Coverage decreased (-0.37%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-25T22:21:03Z

Coverage decreased (-0.57%) when pulling 672ab72 on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-25T22:53:26Z

Coverage decreased (-0.35%) when pulling 8843597 on jimmysong:1 into 4ca0daa on conformal:master.

jimmysong · 2014-09-26T00:16:54Z

I figured out why the splitK function needed the +3. It's because fieldVal doesn't take negative values. This optimization makes negative k values possible. However, adding enough to the c_n's at the right time seems to always produce positive numbers.

jimmysong · 2014-09-26T01:14:51Z

No more fudging by adding 3. Besides the sign not being passed through, there was another error related to left-to-right addition, which I fixed.

coveralls · 2014-09-26T01:16:42Z

Coverage increased (+0.12%) when pulling 5daa225 on jimmysong:1 into 4ca0daa on conformal:master.

jimmysong · 2014-09-26T01:28:55Z

Removed the modulo logic (actually unnecessary) from the splitK function. Gave an 8-9% speed boost.

coveralls · 2014-09-26T01:30:19Z

Coverage increased (+0.33%) when pulling 5aecdcf on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-26T02:11:28Z

Coverage increased (+0.34%) when pulling b4cadcb on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-26T02:45:38Z

Coverage decreased (-0.07%) when pulling 7425341 on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-26T14:03:05Z

Coverage increased (+0.14%) when pulling 611c140 on jimmysong:1 into 4ca0daa on conformal:master.

coveralls · 2014-09-26T18:56:59Z

Coverage increased (+0.14%) when pulling 4f948a9 on jimmysong:1 into 4ca0daa on conformal:master.

davecgh · 2014-09-27T00:52:58Z

Thanks for this Jimmy.

I'll review it in detail over the weekend, but I briefly looked over it so far and confirmed the choices for λ and β are accurate. That is to say the following equations hold:

β^3 (mod P) = 1
λ^3 (mod N) = 1
λ^2 + λ + 1 (mod N) = 0
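
These three identities are easy to re-check locally. The sketch below assumes the standard secp256k1 P and N and the λ and β values quoted later in this thread; the helper names are hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// mustHex parses a hex constant (spaces allowed) and panics on a typo.
func mustHex(s string) *big.Int {
	v, ok := new(big.Int).SetString(strings.ReplaceAll(s, " ", ""), 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	fieldP = mustHex("ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff fffffffe fffffc2f")
	orderN = mustHex("ffffffff ffffffff ffffffff fffffffe baaedce6 af48a03b bfd25e8c d0364141")
	lambda = mustHex("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	beta   = mustHex("7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee")
	one    = big.NewInt(1)
)

// isCubeRootOfUnity reports whether x^3 ≡ 1 (mod m).
func isCubeRootOfUnity(x, m *big.Int) bool {
	return new(big.Int).Exp(x, big.NewInt(3), m).Cmp(one) == 0
}

// satisfiesQuadratic reports whether x^2 + x + 1 ≡ 0 (mod m).
func satisfiesQuadratic(x, m *big.Int) bool {
	v := new(big.Int).Mul(x, x)
	v.Add(v, x).Add(v, one).Mod(v, m)
	return v.Sign() == 0
}

func main() {
	fmt.Println("beta^3 = 1 (mod P):           ", isCubeRootOfUnity(beta, fieldP))
	fmt.Println("lambda^3 = 1 (mod N):         ", isCubeRootOfUnity(lambda, orderN))
	fmt.Println("lambda^2 + lambda + 1 (mod N):", satisfiesQuadratic(lambda, orderN))
}
```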

coveralls · 2014-09-27T01:26:17Z

Coverage increased (+0.32%) when pulling 8848df1 on jimmysong:1 into d694428 on conformal:master.

coveralls · 2014-09-27T01:28:53Z

Coverage increased (+0.11%) when pulling 386a945 on jimmysong:1 into d694428 on conformal:master.

coveralls · 2015-01-22T16:22:10Z

Coverage increased (+0.11%) to 97.97% when pulling 2206dff on jimmysong:1 into f9365fd on btcsuite:master.

jimmysong · 2015-01-22T16:24:10Z

@davecgh I've rebased this one so the merge is smooth. Would it be possible to look at this and #10 soonish? I'd hate for these PRs to just sit there.

coveralls · 2015-02-01T19:55:06Z

Coverage increased (+0.33%) to 97.25% when pulling 69d8e09 on jimmysong:1 into 9535058 on btcsuite:master.

jimmysong · 2015-02-03T05:06:45Z

@davecgh, I finally understand how lambda and beta are derived using Fermat's Little Theorem:

http://bitcoin.stackexchange.com/questions/35814/how-do-you-derive-the-lambda-and-beta-values-for-endomorphism-on-the-secp256k1-c/

Might be of interest to you since you were wondering this yourself.

davecgh · 2015-02-03T07:42:06Z

btcec.go

+// G^N = 1 and thus any other valid point on the elliptical curve has the
+// same order.
+func (curve *KoblitzCurve) moduloReduce(k []byte) []byte {
+	var newK []byte


I think we can do without this additional local here. Inside the first if branch, just return tmpK.Bytes(). Then just eliminate the else branch and return k.

if len(k) > curve.BitSize/8 {
	tmpK := new(big.Int).SetBytes(k)
	tmpK.Mod(tmpK, curve.N)
	return tmpK.Bytes()
}
return k
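
A self-contained variant of the suggestion above can be tried outside the package. In this sketch the curve receiver is replaced by explicit `n` and `byteSize` parameters, which is an assumption made only so the snippet compiles on its own; the input is built as N·256 + 5 so that it is 33 bytes long and reduces to 5.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// mustHex parses a hex constant (spaces allowed) and panics on a typo.
func mustHex(s string) *big.Int {
	v, ok := new(big.Int).SetString(strings.ReplaceAll(s, " ", ""), 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

// orderN is the order of the secp256k1 generator.
var orderN = mustHex("ffffffff ffffffff ffffffff fffffffe baaedce6 af48a03b bfd25e8c d0364141")

// moduloReduce reduces k modulo n only when k is longer than byteSize bytes.
// Shorter inputs are returned untouched, so the common case allocates nothing.
func moduloReduce(k []byte, n *big.Int, byteSize int) []byte {
	if len(k) > byteSize {
		tmpK := new(big.Int).SetBytes(k)
		tmpK.Mod(tmpK, n)
		return tmpK.Bytes()
	}
	return k
}

// oversizedScalar returns the big-endian bytes of n*256 + r, which is
// congruent to r mod n and one byte longer than n.
func oversizedScalar(n *big.Int, r int64) []byte {
	k := new(big.Int).Lsh(n, 8)
	return k.Add(k, big.NewInt(r)).Bytes()
}

func main() {
	k := oversizedScalar(orderN, 5)
	fmt.Println("input length:", len(k))
	fmt.Printf("reduced: %x\n", moduloReduce(k, orderN, 32))
}
```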

jimmysong · 2015-02-03T14:23:23Z

@davecgh, @jrick, made changes to address your concerns.

jimmysong · 2015-02-03T15:06:42Z

@davecgh, @jrick, something is weird with Travis: it doesn't know the command "go get". Can you rerun?

davecgh · 2015-02-03T15:38:45Z

Thanks for updating @jimmysong. It appears Travis just updated their release version of Go, so I had to modify the .travis.yml. If you rebase on top of master, it will build again.

Also, I'll just modify it after the PR goes in to avoid going back and forth, but I absolutely do not like the const curveSize stuff. I know that currently it only does secp256k1, but I want to move the package more towards being able to support other curves, not further away like that change does.

The constant needs to be defined on the curve, so other curves can work as well.

jimmysong · 2015-02-03T16:03:11Z

@davecgh, rebased and took out the const stuff.

Is there a way to make a const field in a struct in go or is the way I did it acceptable?
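
For context on the question: Go has no const struct fields, since constants can only be declared at package or function scope. One common pattern, sketched below, is an unexported field that is computed once at construction and exposed through a read-only accessor. The `curveParams` type and its constructor are hypothetical and are not the approach taken in this PR, which the thread does not show.

```go
package main

import "fmt"

// curveParams is a hypothetical stand-in for a curve type.
type curveParams struct {
	name    string
	bitSize int
	// byteSize is derived once in the constructor. It is unexported, so code
	// outside the package cannot reassign it.
	byteSize int
}

// newCurveParams fills in the derived size so each curve carries its own value.
func newCurveParams(name string, bitSize int) *curveParams {
	return &curveParams{
		name:     name,
		bitSize:  bitSize,
		byteSize: (bitSize + 7) / 8, // round up for sizes that are not a multiple of 8
	}
}

// ByteSize is a read-only accessor for the per-curve size.
func (c *curveParams) ByteSize() int { return c.byteSize }

func main() {
	curves := []*curveParams{
		newCurveParams("secp256k1", 256),
		newCurveParams("P-521", 521),
	}
	for _, c := range curves {
		fmt.Printf("%s: %d bits, %d bytes\n", c.name, c.bitSize, c.ByteSize())
	}
}
```

Keeping the value on the curve rather than in a package-level const means a second curve with a different size works without touching the callers.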

coveralls · 2015-02-03T16:04:39Z

Coverage increased (+0.34%) to 97.05% when pulling 3a91c2f on jimmysong:1 into 46829e8 on btcsuite:master.

davecgh · 2015-02-03T16:10:23Z

@jimmysong The way you did it is great. Thanks!

This implements a speedup to ScalarMult using the endomorphism available to secp256k1.

Note the constants lambda, beta, a1, b1, a2 and b2 are from here:

https://bitcointalk.org/index.php?topic=3238.0

Preliminary tests indicate a speedup of between 17%-20% (BenchScalarMult).

More speedup can probably be achieved once splitK uses something more like what fieldVal uses. Unfortunately, the prime for this math is the order of G (N), not P.

Note the NAF optimization was specifically not done as that's the purview of another issue.

Changed both ScalarMult and ScalarBaseMult to take advantage of curve.N to reduce k. This results in an 80% speedup for large values of k for ScalarBaseMult. Note the new test BenchmarkScalarBaseMultLarge is how that speedup number can be checked.

This closes btcsuite#1

davecgh · 2015-02-04T17:24:51Z

gensecp256k1.go

@@ -18,8 +18,7 @@ var secp256k1BytePoints = []byte{}
 // 0..n-1 where n is the curve's bit size (256 in the case of secp256k1)
 // the coordinates are recorded as Jacobian coordinates.
 func (curve *KoblitzCurve) getDoublingPoints() [][3]fieldVal {
-	bitSize := curve.Params().BitSize


The bitSize is used below in the loop too. It should be updated to curve.BitSize as well or the generation code will fail to compile.

You can run rm secp256k1.go; go generate to test.

davecgh · 2015-02-04T18:41:06Z

Alright, so I'm letting this run every signature on the block chain before merging just to be paranoid, but I've independently derived and double checked all of the math and everything looks accurate. In particular:

The possible values for λ and β are derived with:

λ = 2^((N-1) / 3) (mod N) = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce
λ = 3^((N-1) / 3) (mod N) = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72
β = 2^((P-1) / 3) (mod P) = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee
β = 3^((P-1) / 3) (mod P) = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40

The two possible λ and β values are indeed squares of one another:

λ^2 = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce^2 (mod N) = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72
λ^2 = 5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72^2 (mod N) = ac9c52b33fa3cf1f5ad9e3fd77ed9ba4a880b9fc8ec739c2e0cfc810b51283ce
β^2 = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee^2 (mod P) = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40
β^2 = 851695d49a83f8ef919bb86153cbcb16630fb68aed0a766a3ec693d68e6afa40^2 (mod P) = 7ae96a2b657c07106e64479eac3434e99cf0497512f58995c1396c28719501ee

Benchmarking the available options for λ and β empirically shows that the values chosen in this PR provide the greatest speedup. In particular, the generator chosen for λ is 3 while the generator chosen for β is 2.

The values chosen for the linearly independent vectors used during computation of the endomorphism have been independently derived and verified to satisfy the equation f(v) = a+bλ mod N = 0 for the chosen λ:

a1 = 3086d221a7d46bcde86c90e49284eb15
b1 = -e4437ed6010e88286f547fa90abfe4c3
a2 = 114ca50f7a8e2f3f657c1108d9d44cfd8
b2 = 3086d221a7d46bcde86c90e49284eb15

3086d221a7d46bcde86c90e49284eb15 + -e4437ed6010e88286f547fa90abfe4c3 * λ (mod N) = 0
114ca50f7a8e2f3f657c1108d9d44cfd8 + 3086d221a7d46bcde86c90e49284eb15 * λ (mod N) = 0
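
The two congruences above can be re-checked in a few lines. This sketch assumes the standard secp256k1 N and the λ chosen in this PR; the `inKernel` helper name is hypothetical.

```go
package main

import (
	"fmt"
	"math/big"
	"strings"
)

// mustHex parses a hex constant (spaces and a leading sign allowed).
func mustHex(s string) *big.Int {
	v, ok := new(big.Int).SetString(strings.ReplaceAll(s, " ", ""), 16)
	if !ok {
		panic("bad hex constant: " + s)
	}
	return v
}

var (
	orderN = mustHex("ffffffff ffffffff ffffffff fffffffe baaedce6 af48a03b bfd25e8c d0364141")
	lambda = mustHex("5363ad4cc05c30e0a5261c028812645a122e22ea20816678df02967c1b23bd72")
	a1     = mustHex("3086d221a7d46bcde86c90e49284eb15")
	b1     = mustHex("-e4437ed6010e88286f547fa90abfe4c3")
	a2     = mustHex("114ca50f7a8e2f3f657c1108d9d44cfd8")
	b2     = mustHex("3086d221a7d46bcde86c90e49284eb15")
)

// inKernel reports whether a + b*lambda ≡ 0 (mod N).
func inKernel(a, b *big.Int) bool {
	v := new(big.Int).Mul(b, lambda)
	v.Add(v, a).Mod(v, orderN)
	return v.Sign() == 0
}

func main() {
	fmt.Println("a1 + b1*lambda = 0 (mod N):", inKernel(a1, b1))
	fmt.Println("a2 + b2*lambda = 0 (mod N):", inKernel(a2, b2))
}
```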

Finally, the following equations hold as required:

λ^3 (mod N) = 1
β^3 (mod P) = 1
λ^2 + λ + 1 (mod N) = 0

jimmysong force-pushed the 1 branch 2 times, most recently from c27a495 to 672ab72 Compare September 25, 2014 22:16

jimmysong force-pushed the 1 branch from 672ab72 to 8843597 Compare September 25, 2014 22:49

jimmysong force-pushed the 1 branch from 8843597 to 5daa225 Compare September 26, 2014 01:13

jimmysong force-pushed the 1 branch from 5daa225 to 5aecdcf Compare September 26, 2014 01:28

jimmysong force-pushed the 1 branch from 5aecdcf to b4cadcb Compare September 26, 2014 02:08

jimmysong force-pushed the 1 branch from b4cadcb to 7425341 Compare September 26, 2014 02:43

jimmysong force-pushed the 1 branch from 7425341 to 611c140 Compare September 26, 2014 13:58

jimmysong force-pushed the 1 branch from 611c140 to 4f948a9 Compare September 26, 2014 18:54

jimmysong force-pushed the 1 branch from 4f948a9 to 8848df1 Compare September 27, 2014 01:24

jimmysong force-pushed the 1 branch from 8848df1 to 386a945 Compare September 27, 2014 01:26

jimmysong force-pushed the 1 branch from 386a945 to 2206dff Compare January 22, 2015 16:07

jimmysong force-pushed the 1 branch from 2206dff to 69d8e09 Compare February 1, 2015 19:52

davecgh reviewed Feb 3, 2015

jimmysong force-pushed the 1 branch from 69d8e09 to 9fb8315 Compare February 3, 2015 14:22

jimmysong force-pushed the 1 branch from 9fb8315 to 3a91c2f Compare February 3, 2015 15:58

jimmysong force-pushed the 1 branch from 3a91c2f to 4fb1a9a Compare February 3, 2015 20:05

jimmysong force-pushed the 1 branch from 4fb1a9a to 95b23c2 Compare February 3, 2015 20:14

davecgh reviewed Feb 4, 2015

conformal-deploy merged commit 95b23c2 into btcsuite:master Feb 5, 2015
