Optimizations derived from armfazh/flor-sidh-x64 project. #2
grittygrease merged 4 commits into cloudflarearchive:master
Look at optimizations.md file.
vkrasnov
left a comment
Very good work, I only have a few nits.
optimizations.md
Outdated
@@ -0,0 +1,37 @@
This file should not be part of the PR
p751toolbox/curve.go
Outdated
}

// Given xP = x(P), xQ = x(Q), and xPmQ = x(P-Q), compute xPaddQ = x(P+Q) and x2P = x(2P).
func xDblAdd(curve *CachedCurveParameters, xP, xQ, xPmQ *ProjectivePoint) (x2P, xPaddQ ProjectivePoint) {
I think that should probably be a method of *CachedCurveParameters
Since this function operates on points, it is now a method of ProjectivePoint and ProjectivePrimeFieldPoint.
Good suggestion.
p751toolbox/curve.go
Outdated
// Given xP = x(P), xQ = x(Q), and xPmQ = x(P-Q), compute xPaddQ = x(P+Q) and x2P = x(2P).
// Assumes that the Z-coordinate of PmQ is equal to 1.
func xDblAdd_primefield(aPlus2Over4 *PrimeFieldElement, xP, xQ, xPmQ *ProjectivePrimeFieldPoint) (x2P, xPaddQ ProjectivePrimeFieldPoint) {
	var A, AA, B, BB, C, D, E, DA, CB, t0, t1 PrimeFieldElement
You could reuse temporary variables more efficiently instead of having so many of them.
DblAdd now uses the auxiliary variables more efficiently.
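The suggestions in this thread (make the function a method on the point type, use fewer temporaries, write results straight into the outputs) can be sketched roughly as below. This is only an illustration under stated assumptions, not the code that was merged: the field is a toy prime field with p = 431, the curve constant is A = 6 so that (A+2)/4 = 2, and the names `Fp`, `Point` and `DblAdd` are hypothetical stand-ins for the p751toolbox types. Unlike the reviewed function, the sketch does not assume that the Z-coordinate of P-Q is 1.

```go
package main

import "fmt"

// p is a toy prime (2^4 * 3^3 - 1), assumed here for illustration only.
// The code under review works over a 751-bit prime.
const p = 431

// Fp is a hypothetical field element with receiver-style arithmetic,
// loosely mirroring calls such as t1.Add(&DA, &CB) in the diff.
type Fp struct{ v uint64 }

func (z *Fp) Add(x, y *Fp) { z.v = (x.v + y.v) % p }
func (z *Fp) Sub(x, y *Fp) { z.v = (x.v + p - y.v) % p }
func (z *Fp) Mul(x, y *Fp) { z.v = (x.v * y.v) % p }
func (z *Fp) Square(x *Fp) { z.v = (x.v * x.v) % p }

// Point is an x-only projective point (X:Z) on y^2 = x^3 + A*x^2 + x.
type Point struct{ X, Z Fp }

// DblAdd returns x(2P) and x(P+Q), where the receiver is x(P) and
// xPmQ is x(P-Q). a24 is (A+2)/4. Only two temporaries are used;
// the output coordinates double as scratch space.
func (xP *Point) DblAdd(xQ, xPmQ *Point, a24 *Fp) (x2P, xPaddQ Point) {
	var t0, t1 Fp
	t0.Add(&xP.X, &xP.Z)          // t0 = A = XP+ZP
	t1.Sub(&xP.X, &xP.Z)          // t1 = B = XP-ZP
	xPaddQ.X.Sub(&xQ.X, &xQ.Z)    // D = XQ-ZQ
	xPaddQ.Z.Add(&xQ.X, &xQ.Z)    // C = XQ+ZQ
	xPaddQ.X.Mul(&xPaddQ.X, &t0)  // DA = D*A
	xPaddQ.Z.Mul(&xPaddQ.Z, &t1)  // CB = C*B
	t0.Square(&t0)                // t0 = AA = A^2
	t1.Square(&t1)                // t1 = BB = B^2
	x2P.X.Mul(&t0, &t1)           // X(2P) = AA*BB
	t0.Sub(&t0, &t1)              // t0 = E = AA-BB
	x2P.Z.Mul(a24, &t0)           // a24*E
	x2P.Z.Add(&x2P.Z, &t1)        // BB + a24*E
	x2P.Z.Mul(&x2P.Z, &t0)        // Z(2P) = E*(BB + a24*E)
	t0.Add(&xPaddQ.X, &xPaddQ.Z)  // t0 = DA+CB
	t1.Sub(&xPaddQ.X, &xPaddQ.Z)  // t1 = DA-CB
	t0.Square(&t0)                // t0 = (DA+CB)^2
	t1.Square(&t1)                // t1 = (DA-CB)^2
	xPaddQ.X.Mul(&xPmQ.Z, &t0)    // X(P+Q) = Z(P-Q)*(DA+CB)^2
	xPaddQ.Z.Mul(&xPmQ.X, &t1)    // Z(P+Q) = X(P-Q)*(DA-CB)^2
	return
}

func main() {
	a24 := Fp{2} // (A+2)/4 for the assumed curve constant A = 6
	P := Point{Fp{5}, Fp{1}}

	P2, _ := P.DblAdd(&P, &P, &a24)      // only the doubling output is meaningful here
	_, P3 := P.DblAdd(&P2, &P, &a24)     // P + 2P, with difference P
	P4, _ := P2.DblAdd(&P2, &P2, &a24)   // 2*(2P)
	_, P4alt := P.DblAdd(&P3, &P2, &a24) // P + 3P, with difference 2P

	// Both routes to x(4P) should agree as projective points.
	same := P4.X.v*P4alt.Z.v%p == P4alt.X.v*P4.Z.v%p
	fmt.Println(same)
}
```

Because the results are named return values that are distinct from the inputs, the receiver may alias `xQ` or `xPmQ` without the intermediate writes clobbering anything that is still needed.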
p751toolbox/curve.go
Outdated
	CB.Mul(&C, &B) // CB = C*B

	t1.Add(&DA, &CB) // t1 = DA+CB
	t0.Sub(&DA, &CB) // t0 = DA-CB
For example, reuse DA instead of t0, or even write right into z5 (xPaddQ.X?) to avoid copying.
p751toolbox/curve.go
Outdated
	t1.Square(&t1) // t1 = t1^2
	t0.Square(&t0) // t0 = t0^2

	xPaddQ.X = t1 // z5 = z1*t1
Not sure if z and x in the comments are mixed up, or is it on purpose?
p751toolbox/curve.go
Outdated
}

/**
Update: This is the right-to-left method for computing the x-coordinate of P+[k]Q.
Update is not a valid comment IMO. I would modify the existing description.
p751toolbox/print.go
Outdated
package p751toolbox

import (
	"reflect"
You don't need reflect here. Please avoid it at all costs.
reflect package dependency removed.
p751toolbox/print.go
Outdated
func (element Fp751Element) String() string {
	var out [94]byte
	element.toBytesFromMontgomeryForm(out[:])
	return fmt.Sprintf("%s",PrintHex(out[:]))
Inconsistent commas. Did you apply gofmt?
- optimizations.md file removed.
- DblAdd function is now a method of ProjectivePoint and ProjectivePrimeFieldPoint.
- Better reuse of variables inside the DblAdd method.
- The reflect package is no longer required in p751toolbox/print.go.
Hi cloudflare,
This PR contains some optimizations derived from the flor-sidh-x64 project. Basically, this PR shows the impact of the right-to-left algorithm on the calculation of the shared secret function. Also, it contains a faster formula for computing point triplings, i.e., given a point P, it computes [3]P.
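For readers unfamiliar with the right-to-left approach, here is a minimal sketch of a right-to-left three-point ladder that computes x(P+[k]Q) from x(P), x(Q) and x(Q-P). It is written under several assumptions and is not the PR's implementation: the field is a toy prime field with p = 431, the curve constant is A = 6, all type and function names are hypothetical, and the loop branches on the scalar bits, so it is not constant-time (a real implementation would use conditional swaps instead).

```go
package main

import "fmt"

const p = 431 // toy prime, assumed for illustration only

// Fp, Point and DblAdd are hypothetical stand-ins for the toolbox types.
type Fp struct{ v uint64 }

func (z *Fp) Add(x, y *Fp) { z.v = (x.v + y.v) % p }
func (z *Fp) Sub(x, y *Fp) { z.v = (x.v + p - y.v) % p }
func (z *Fp) Mul(x, y *Fp) { z.v = (x.v * y.v) % p }
func (z *Fp) Square(x *Fp) { z.v = (x.v * x.v) % p }

// Point is an x-only projective point (X:Z).
type Point struct{ X, Z Fp }

// DblAdd returns x(2P) and x(P+Q); the receiver is x(P), xPmQ is x(P-Q).
func (xP *Point) DblAdd(xQ, xPmQ *Point, a24 *Fp) (x2P, xPaddQ Point) {
	var t0, t1 Fp
	t0.Add(&xP.X, &xP.Z)
	t1.Sub(&xP.X, &xP.Z)
	xPaddQ.X.Sub(&xQ.X, &xQ.Z)
	xPaddQ.Z.Add(&xQ.X, &xQ.Z)
	xPaddQ.X.Mul(&xPaddQ.X, &t0) // DA
	xPaddQ.Z.Mul(&xPaddQ.Z, &t1) // CB
	t0.Square(&t0)               // AA
	t1.Square(&t1)               // BB
	x2P.X.Mul(&t0, &t1)          // X(2P) = AA*BB
	t0.Sub(&t0, &t1)             // E = AA-BB
	x2P.Z.Mul(a24, &t0)
	x2P.Z.Add(&x2P.Z, &t1)
	x2P.Z.Mul(&x2P.Z, &t0) // Z(2P) = E*(BB + a24*E)
	t0.Add(&xPaddQ.X, &xPaddQ.Z)
	t1.Sub(&xPaddQ.X, &xPaddQ.Z)
	t0.Square(&t0)
	t1.Square(&t1)
	xPaddQ.X.Mul(&xPmQ.Z, &t0) // X(P+Q) = Z(P-Q)*(DA+CB)^2
	xPaddQ.Z.Mul(&xPmQ.X, &t1) // Z(P+Q) = X(P-Q)*(DA-CB)^2
	return
}

// sameX reports whether a and b are equal as projective x-coordinates.
func sameX(a, b *Point) bool {
	return a.X.v*b.Z.v%p == b.X.v*a.Z.v%p
}

// LadderRightToLeft returns x(P + [k]Q), scanning the nbits low bits of k
// from least to most significant. Before step i the state is
// r0 = [2^i]Q, r1 = P + [k mod 2^i]Q, r2 = r0 - r1.
func LadderRightToLeft(xP, xQ, xQmP *Point, k uint64, nbits int, a24 *Fp) Point {
	r0, r1, r2 := *xQ, *xP, *xQmP
	for i := 0; i < nbits; i++ {
		if (k>>uint(i))&1 == 1 {
			r0, r1 = r0.DblAdd(&r1, &r2, a24) // r1 += r0; r0 - r1 stays r2
		} else {
			r0, r2 = r0.DblAdd(&r2, &r1, a24) // r2 += r0; r1 unchanged
		}
	}
	return r1
}

func main() {
	a24 := Fp{2} // (A+2)/4 for the assumed curve constant A = 6
	Q := Point{Fp{5}, Fp{1}}

	// m[j] = x([j]Q), built by repeated differential addition.
	m := make([]Point, 16)
	m[1] = Q
	m[2], _ = Q.DblAdd(&Q, &Q, &a24)
	for j := 3; j < 16; j++ {
		_, m[j] = Q.DblAdd(&m[j-1], &m[j-2], &a24)
	}

	// With P = [8]Q, the difference Q-P = -[7]Q has x-coordinate m[7],
	// and P + [5]Q should match x([13]Q).
	got := LadderRightToLeft(&m[8], &Q, &m[7], 5, 3, &a24)
	fmt.Println(sameX(&got, &m[13]))
}
```

In this formulation every iteration doubles the same running multiple of Q and performs one differential addition, so the combined doubling-and-addition routine is called exactly once per scalar bit.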
Please consider this PR as an improvement on the performance of your Go implementation.
Best,
General optimizations
The following are general and well-known optimizations.
This saves 1M per iteration in the 3-point ladder and 2M per iteration in the ScalarMult function, where M denotes one field multiplication.
Optimizations derived from FLOR-SIDH-x64
The following are specific optimizations based on the FLOR-SIDH-x64 work.
Benchmark Comparison
For Shared Secret computation, the execution time was reduced by 9.5%.