fix: replace rug integer and upgrade curve operations #29
And it looks like we're able to overload operators between a U576 scalar and a hypothetical curve point without any issues:
use crypto_bigint::{U576, Wrapping, Encoding};
use std::ops::Mul;
#[derive(Debug, Clone, Copy)]
struct CurvePoint {
x: U576,
y: U576,
}
impl Mul<U576> for CurvePoint {
type Output = Self;
fn mul(self, scalar: U576) -> Self {
Self {
x: (Wrapping(self.x) * Wrapping(scalar)).0,
y: (Wrapping(self.y) * Wrapping(scalar)).0,
}
}
}
fn main() {
let hex_string = "1FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF";
let buffer = push_bytes_into_buffer(hex_string);
// Convert the buffer to a U576 number
let number = U576::from_be_bytes(buffer);
let point = CurvePoint { x: number, y: number };
let scalar = U576::from_be_bytes(buffer);
let result = point * scalar;
println!("Result: {:?}", result);
}
I made a special point of bit testing in our asks because it can be surprisingly non-trivial in some cases, and crypto_bigint does not disappoint. This code explicitly calls the Into conversion for CtChoice, ensuring that the conversion is performed in a way that maintains constant-time guarantees:
// simple test of true/false
if scalar.bit(0).into() {
println!("test bit = 0")
}
// To negate the same test, the compiler needs the type, so
// this is explicitly using the Into<bool> trait implementation for CtChoice
// to convert the result of scalar.bit(0) into a bool.
// This is a more explicit way of doing what .into() does above
// but the compiler demands it to force the explicit choice of a constant time operation
if !(<CtChoice as Into<bool>>::into(scalar.bit(0))) {
println!("test bit != 0")
}
I just wanted to test some bits, man. I'm not lookin' for any trouble.
Moving to crypto_bigint would also likely resolve this issue, since we would no longer be relying on rug to determine the number of steps for the Montgomery ladder.
rug:
for i in (0..=s.bits()).rev()
crypto_bigint offers the same functionality but will need to be tested:
let num = x1.bits().reverse_bits();
with reverse_bits being a core method, so we need to figure out if it also has the leading zeros issue (confirmed it does not). Continuing to rely on rug remains a security liability for this library.
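To make the leading-zeros concern concrete, here is a minimal sketch of a ladder whose step count is fixed by the type width rather than by the bit length of the scalar. It uses modular exponentiation on u64 as a stand-in for curve points, and the names `mul_mod`, `cswap` and `ladder_pow` are hypothetical helpers, not part of rug or crypto_bigint. The `%` used in `mul_mod` is not constant-time; only the loop structure is the point here.

```rust
/// Modular multiplication through a 128-bit intermediate (illustrative only,
/// the `%` operator is NOT constant-time).
fn mul_mod(a: u64, b: u64, m: u64) -> u64 {
    ((a as u128 * b as u128) % m as u128) as u64
}

/// Swaps `a` and `b` when `choice` is 1 and leaves them alone when it is 0,
/// using a mask instead of a branch.
fn cswap(choice: u64, a: &mut u64, b: &mut u64) {
    let mask = choice.wrapping_neg(); // 0 -> 0x00..00, 1 -> 0xFF..FF
    let t = mask & (*a ^ *b);
    *a ^= t;
    *b ^= t;
}

/// Montgomery-ladder exponentiation that always runs 64 steps, so the number
/// of iterations does not depend on how many leading zero bits `exp` has.
fn ladder_pow(base: u64, exp: u64, m: u64) -> u64 {
    let mut r0 = 1 % m; // invariant: r1 == r0 * base (mod m)
    let mut r1 = base % m;
    for i in (0..64).rev() {
        let bit = (exp >> i) & 1;
        cswap(bit, &mut r0, &mut r1);
        r1 = mul_mod(r0, r1, m);
        r0 = mul_mod(r0, r0, m);
        cswap(bit, &mut r0, &mut r1);
    }
    r0
}
```

Calling `ladder_pow(3, 5, 1_000_000_007)` walks all 64 bit positions even though only the low three bits of the exponent are set, which is the behaviour we want from the curve ladder as well.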
Update: it took a while to parse crypto_bigint; the documentation leaves a lot to be desired in terms of usage examples, but let's see if we can start to fix up some of our problematic code. We have this block from the Edwards curve add formula using rug:
// (x₁y₂ + y₁x₂)
let x1y2 = (x1.clone() * y2.clone()) % p.clone();
let y1x2 = (y1.clone() * x2.clone()) % p.clone();
let x1y2y1x2_sum = (x1y2 + y1x2) % p.clone();
Can we fix this up with crypto_bigint?
Step 1: Become familiar with wide multiplication. Let's define a modulus (this is actually the modulus from E448 in our library) using the crypto_bigint impl_modulus macro:
impl_modulus!(
Modulus,
U448, "FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF"
);
Step 2: Let's define some numbers:
let a = U448::from_be_hex("FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF");
let b = U448::from_be_hex("FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF");
Step 3: Square the modulus by using mul_wide on a and b:
let c: (Uint<7>, Uint<7>) = a.mul_wide(&b);
This gives us:
c lower: 0000000000000000000000000000000000000000000000000000000200000000000000000000000000000000000000000000000000000001
c upper: FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFDFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
For squaring, we could also have used:
let d = a.square_wide();
Step 4: Reduce. This step plays the role of ab mod p, but using the highly efficient Montgomery reduction. (Strictly speaking, Montgomery reduction returns ab·R⁻¹ mod p with R = 2^448, so it equals ab mod p only when the operands are kept in Montgomery form.)
let reduction: Uint<7> = montgomery_reduction::<{ Modulus::LIMBS }>(
&c,
&Modulus::MODULUS,
Modulus::MOD_NEG_INV,
);
resulting in:
reduction: 0
which holds because p^2 mod p is indeed zero.
Step 5: Put it all together:
// (x₁y₂ + y₁x₂)
let x1y2 = montgomery_reduction(&x1.mul_wide(&y2), &Modulus::MODULUS, Modulus::MOD_NEG_INV);
let y1x2 = montgomery_reduction(&y1.mul_wide(&x2), &Modulus::MODULUS, Modulus::MOD_NEG_INV);
let x1y2y1x2_sum = x1y2.add_mod(&y1x2, &Modulus::MODULUS);
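For anyone following along, here is a small sketch of what a wide multiply followed by Montgomery reduction does, scaled down to a single u64 limb with R = 2^64. It is not the crypto_bigint implementation: `neg_inv_u64`, `redc`, `to_mont` and `mul_mod_mont` are hypothetical names, and the sketch assumes an odd modulus below 2^63 so the intermediate sum fits in a u128. It shows that one reduction yields ab·R⁻¹ mod m, and that operands in Montgomery form bring the result back to ab mod m.

```rust
/// Returns -m^{-1} mod 2^64 for an odd modulus `m`
/// (the role MOD_NEG_INV plays in the snippet above).
fn neg_inv_u64(m: u64) -> u64 {
    assert!(m & 1 == 1, "modulus must be odd");
    let mut inv: u64 = 1; // correct to 1 bit because m is odd
    for _ in 0..6 {
        // Newton step: doubles the number of correct low bits (1 -> 64).
        inv = inv.wrapping_mul(2u64.wrapping_sub(m.wrapping_mul(inv)));
    }
    inv.wrapping_neg()
}

/// Montgomery reduction: returns t * R^{-1} mod m, with R = 2^64.
/// Assumes m < 2^63 and t < m * 2^64 so that `sum` cannot overflow.
fn redc(t: u128, m: u64, m_neg_inv: u64) -> u64 {
    debug_assert!(m < (1u64 << 63));
    let q = (t as u64).wrapping_mul(m_neg_inv);
    let sum = t + (q as u128) * (m as u128); // low 64 bits are now zero
    let r = (sum >> 64) as u64;
    if r >= m { r - m } else { r }
}

/// Converts `a` (already reduced mod m) into Montgomery form a * R mod m.
fn to_mont(a: u64, m: u64) -> u64 {
    (((a as u128) << 64) % (m as u128)) as u64
}

/// a * b mod m computed entirely through Montgomery reductions.
fn mul_mod_mont(a: u64, b: u64, m: u64) -> u64 {
    let m_neg_inv = neg_inv_u64(m);
    let a_m = to_mont(a % m, m);
    let b_m = to_mont(b % m, m);
    // (aR)(bR) reduces to abR: still in Montgomery form.
    let ab_m = redc(a_m as u128 * b_m as u128, m, m_neg_inv);
    // One more reduction strips the remaining factor of R.
    redc(ab_m as u128, m, m_neg_inv)
}
```

With m = 7, `redc(49, 7, neg_inv_u64(7))` gives 0, mirroring the p^2 check above, while reducing a plain product such as 6 gives 6·R⁻¹ mod 7 rather than 6.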
Observations:
|
To no one's surprise this has now become a literature review. As we tumble deeper down the rabbit hole, I've started to convert our add formula from affine form into twisted Edwards form. This projects coordinate pairs from (x, y) into (x, y, z). The advantage of this is that it greatly simplifies the add formula and reduces the number of operations we need to carry out; most notably, it removes the modular inversions from each addition. We may consider just leaving all EC operations in projected form. We aren't interacting with other libraries, so it might be advantageous to leave it this way. Anyways, I'm going to proceed with implementing this formula (page 12):
Addition in Projective Twisted Coordinates:
This repo more clearly defines the strategy we need in order to take advantage of fast addition and scalar multiplication:
Strategy
The main strategy for group arithmetic on Ed448-Goldilocks is to perform the 2-isogeny to map the point to the Twisted-Goldilocks curve, then use the faster Twisted Edwards formulas to perform scalar multiplication. Computing the 2-isogeny then the dual isogeny will pick up a factor of 4 once we map the point back to the Ed448-Goldilocks curve, so the scalar must be adjusted by a factor of 4. Adjusting the scalar is dependent on the point and the scalar. More details can be found in the 2-isogenous paper.
Steps
|
Update on required items to implement the above steps and isogenous transformations, following the patterns of dalek:
define a scalar type:
It should support all of the usual operations, plus:
Montgomery multiplication + reduction:
this should do just fine; tried it with a few test cases and it appears to produce correct results:
montgomery_reduction(&a.mul_wide(&b), &Modulus::MODULUS, Modulus::MOD_NEG_INV);
scalar mod 4 * P
picked up this neat little trick:
// Compute (scalar mod 4)
let s_mod_four = scalar[0] & 3;
This works because & 3 zeroes out everything except the two lowest bits, which are exactly the residue of a mod 4 operation. Super cheap way to compute a remainder modulo a known power of 2.
Field Element
For this I have discovered the fiat_crypto crate, providing formally verified field arithmetic in constant time. We will use this for field elements instead of crypto-bigint for a few reasons:
Field elements should have all of the usual operations, which appear to be readily available.
We're in too deep to give up now
Good problems tend to explode into larger ones. This is way farther than I got last time I tried this, and the solution seems within reach.
Last note: in my blind stupor towards building this out I've abandoned the other curves and am focusing solely on E448.
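As a quick sanity check of the scalar mod 4 trick mentioned above, here is a sketch on a 7-limb little-endian array. The limb layout and the name `mod_four` are assumptions for illustration, not the actual scalar type. Because every higher limb is a multiple of 2^64, and therefore of 4, only the lowest limb contributes to the residue.

```rust
/// Residue mod 4 of a 448-bit value stored as seven little-endian u64 limbs.
/// Only limb 0 matters: every higher limb is weighted by a multiple of 2^64.
fn mod_four(limbs: &[u64; 7]) -> u64 {
    limbs[0] & 3 // keep the two least significant bits
}

/// The same idea for any power of two 2^k with k < 64, on a single limb.
fn mod_pow2(x: u64, k: u32) -> u64 {
    x & ((1u64 << k) - 1)
}
```

For example, `mod_four(&[7, 1, 0, 0, 0, 0, 0])` is 3 regardless of the value in the second limb.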
Due to a Windows build problem with gmp-mpfr-sys, I've gone ahead and removed rug and replaced it with num-bigint. As I feared, this has caused a substantial regression in the performance of our asymmetric operations:
which hopefully illustrates our motivation to grab rug in the first place. |
Update: Brain mush viscosity is somewhere between banana pudding and that milk in the fridge that's definitely expired but I feel too guilty to throw away.
Good news:
Basically just co-opting large parts of this excellent crate, we've managed to get E448 largely up and running with
To illustrate how dramatic the difference is from our original correct but naive approach, this:
pub fn add(mut self, p2: &EdCurvePoint) -> EdCurvePoint {
let x1 = &self.x;
let y1 = &self.y;
let x2 = p2.x.clone();
let y2 = &p2.y;
let p = self.p.clone();
let d = self.d.clone();
// (x₁y₂ + y₁x₂)
let x1y2 = x1.clone() * y2.clone();
let y1x2 = y1.clone() * x2.clone();
let x1y2y1x2_sum = x1y2 + y1x2;
// 1 / (1 + dx₁x₂y₁y₂)
let one_plus_dx1x2y1y2 = (Integer::from(1)
+ (d.clone() * x1.clone() * x2.clone() * y1.clone() * y2.clone()))
% p.clone();
let one_plus_dx1x2y1y2inv = mod_inv(&one_plus_dx1x2y1y2, &p);
// (y₁y₂ − x₁x₂)
let y1y2x1x2_difference = (y1.clone() * y2.clone()) - (x1.clone() * x2.clone());
// 1 / (1 − dx₁x₂y₁y₂)
let one_minus_dx1x2y1y2 = (Integer::from(1) - (d * x1 * x2 * y1 * y2)) % p.clone();
let one_minus_dx1x2y1y2inv = mod_inv(&one_minus_dx1x2y1y2, &p);
// (x₁y₂ + y₁x₂) / (1 + dx₁x₂y₁y₂)
let new_x = ((x1y2y1x2_sum * one_plus_dx1x2y1y2inv) % p.clone() + p.clone()) % p.clone();
// (y₁y₂ − x₁x₂) / (1 − dx₁x₂y₁y₂)
let new_y = ((y1y2x1x2_difference * one_minus_dx1x2y1y2inv) % p.clone() + p.clone()) % p;
self.x = new_x;
self.y = new_y;
self
}
Has become this:
pub fn add(&self, other: &ExtendedCurvePoint) -> ExtendedCurvePoint {
let aXX = self.X * other.X;
let dTT = EDWARDS_D * self.T * other.T;
let ZZ = self.Z * other.Z;
let YY = self.Y * other.Y;
let X1Y2_plus_Y1X2 = (self.X * other.Y) + (self.Y * other.X);
let X = X1Y2_plus_Y1X2 * (ZZ - dTT);
let Y = (YY - aXX) * (ZZ + dTT);
let T = (YY - aXX) * X1Y2_plus_Y1X2;
let Z = (ZZ - dTT) * (ZZ + dTT);
ExtendedCurvePoint { X, Y, Z, T }
}
This is a pretty kickass development considering that we've reached basically all of our original design goals:
The following test cases succeed:
Bad news:
There is none. This work is super cool and rewarding.
What's left:
|
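To convince ourselves that the extended-coordinate formula above really computes the same point as the affine one, here is a self-contained sketch on a toy Edwards curve x² + y² = 1 + d·x²·y² over the integers mod 13 with d = 2 (a non-square mod 13, so the denominators never vanish). The prime, the value of d, and all type and function names are assumptions chosen for illustration; none of this is the E448 code.

```rust
const P: u64 = 13; // toy field prime (assumption, not the E448 prime)
const D: u64 = 2;  // toy curve constant, a non-square mod 13

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn sub(a: u64, b: u64) -> u64 { (a + P - b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

/// Inverse via Fermat's little theorem: a^(P-2) mod P.
fn inv(a: u64) -> u64 {
    let mut acc = 1;
    for _ in 0..(P - 2) { acc = mul(acc, a); }
    acc
}

#[derive(Debug, Clone, Copy, PartialEq)]
struct Affine { x: u64, y: u64 }

#[derive(Debug, Clone, Copy, PartialEq)]
struct Extended { x: u64, y: u64, z: u64, t: u64 }

/// Affine addition: two modular inversions per call.
fn add_affine(p: Affine, q: Affine) -> Affine {
    let k = mul(D, mul(mul(p.x, q.x), mul(p.y, q.y)));
    let x = mul(add(mul(p.x, q.y), mul(p.y, q.x)), inv(add(1, k)));
    let y = mul(sub(mul(p.y, q.y), mul(p.x, q.x)), inv(sub(1, k)));
    Affine { x, y }
}

/// Extended addition with a = 1, mirroring the formula in the comment above:
/// no inversions at all.
fn add_extended(p: Extended, q: Extended) -> Extended {
    let xx = mul(p.x, q.x);
    let dtt = mul(D, mul(p.t, q.t));
    let zz = mul(p.z, q.z);
    let yy = mul(p.y, q.y);
    let s = add(mul(p.x, q.y), mul(p.y, q.x));
    Extended {
        x: mul(s, sub(zz, dtt)),
        y: mul(sub(yy, xx), add(zz, dtt)),
        t: mul(sub(yy, xx), s),
        z: mul(sub(zz, dtt), add(zz, dtt)),
    }
}

fn to_extended(p: Affine) -> Extended {
    Extended { x: p.x, y: p.y, z: 1, t: mul(p.x, p.y) }
}

/// A single inversion, paid once when leaving extended coordinates.
fn to_affine(p: Extended) -> Affine {
    let z_inv = inv(p.z);
    Affine { x: mul(p.x, z_inv), y: mul(p.y, z_inv) }
}
```

On this toy curve the point (4, 4) doubles to (1, 0) under both formulas, and the extended version only pays for an inversion when converting back with `to_affine`.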
Update: We are 3 test cases away from complete success
These test cases are passing:
0 * G = 𝒪
leaving the following left to fix:
(k + t) * G = (k * G) + (t * G)
Observations:
s &= !0b11;
Preparing for merge:
|
WE DID IT
All tests pass:
running 11 tests
test e448_tests::test_g_plus_neg_g ... ok
test e448_tests::test_zero_times_g ... ok
test e448_tests::test_four_g_not_id ... ok
test e448_tests::test_two_times_g ... ok
test e448_tests::test_g_times_one ... ok
test e448_tests::test_four_g ... ok
test e448_tests::k_g_equals_k_mod_r_times_g ... ok
test e448_tests::k_t ... ok
test e448_tests::r_times_g_id ... ok
test e448_tests::k_plus_one_g ... ok
test e448_tests::test_ktp ... ok
The problem was with the implementations for the scalar type. Do remember that reduction of scalar operations is done modulo the curve order r, not the field prime p. We provide a slightly modified version of the curve25519-dalek variable-base, fixed-time multiplication found here, and also here, the only difference being that we operate on u64 limbs to be compatible with the U448 type. Changing the number of limbs required carefully ensuring that the radix_16 recentering step was correct, as well as the modular reduction step for scalar multiplication.
Rug and BigInt are gone:
I've gone ahead and memorialized this momentous achievement:
What this means for the library:
fn add(mut self, p2: &EdCurvePoint) -> EdCurvePoint {
let x1 = &self.x;
let y1 = &self.y;
let x2 = p2.x.clone();
let y2 = &p2.y;
let p = self.p.clone();
let d = self.d.clone();
// (x₁y₂ + y₁x₂)
let x1y2 = x1.clone() * y2.clone();
let y1x2 = y1.clone() * x2.clone();
let x1y2y1x2_sum = x1y2 + y1x2;
// 1 / (1 + dx₁x₂y₁y₂)
let one_plus_dx1x2y1y2 = (Integer::from(1)
+ (d.clone() * x1.clone() * x2.clone() * y1.clone() * y2.clone()))
% p.clone();
let one_plus_dx1x2y1y2inv = mod_inv(&one_plus_dx1x2y1y2, &p);
// (y₁y₂ − x₁x₂)
let y1y2x1x2_difference = (y1.clone() * y2.clone()) - (x1.clone() * x2.clone());
// 1 / (1 − dx₁x₂y₁y₂)
let one_minus_dx1x2y1y2 = (Integer::from(1) - (d * x1 * x2 * y1 * y2)) % p.clone();
let one_minus_dx1x2y1y2inv = mod_inv(&one_minus_dx1x2y1y2, &p);
// (x₁y₂ + y₁x₂) / (1 + dx₁x₂y₁y₂)
let new_x = ((x1y2y1x2_sum * one_plus_dx1x2y1y2inv) % p.clone() + p.clone()) % p.clone();
// (y₁y₂ − x₁x₂) / (1 − dx₁x₂y₁y₂)
let new_y = ((y1y2x1x2_difference * one_minus_dx1x2y1y2inv) % p.clone() + p.clone()) % p;
}
To this:
pub fn add_extended(&self, other: &ExtendedPoint) -> ExtensibleCurvePoint {
let A = self.X * other.X;
let B = self.Y * other.Y;
let C = self.T1 * self.T2 * other.T * TWISTED_D;
let D = self.Z * other.Z;
let E = (self.X + self.Y) * (other.X + other.Y) - A - B;
let F = D - C;
let G = D + C;
let H = B + A;
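// Assemble the result using the standard extended-coordinate outputs:
// X3 = E*F, Y3 = G*H, Z3 = F*G, with T3 = E*H kept as the pair (T1, T2).
ExtensibleCurvePoint {
X: E * F,
Y: G * H,
Z: F * G,
T1: E,
T2: H,
}
}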
What's left:
This might need to go into its own crate eventually. We don't want it to be too tightly coupled to the cryptosystem functions.
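For reference, the radix-16 recentering step mentioned in this comment can be sketched on a single u64 scalar. This is a simplified illustration in the spirit of the dalek approach, not the code in this PR: `to_radix_16` and `from_radix_16` are hypothetical names, and an extra 17th digit is used here to absorb the final carry.

```rust
/// Recodes `k` into 17 signed radix-16 digits so that
/// k = sum(digits[i] * 16^i), with digits[0..16] in [-8, 8)
/// and the top digit equal to 0 or 1.
fn to_radix_16(k: u64) -> [i8; 17] {
    let mut digits = [0i8; 17];
    // Step 1: split into unsigned nibbles, each in [0, 16).
    for i in 0..16 {
        digits[i] = ((k >> (4 * i)) & 0xF) as i8;
    }
    // Step 2: recenter each digit from [0, 16] into [-8, 8),
    // pushing a carry of 0 or 1 into the next digit.
    for i in 0..16 {
        let carry = (digits[i] + 8) >> 4;
        digits[i] -= carry << 4;
        digits[i + 1] += carry;
    }
    digits
}

/// Rebuilds the integer from its signed digits (used only for checking).
fn from_radix_16(digits: &[i8; 17]) -> i128 {
    digits.iter().rev().fold(0i128, |acc, &d| acc * 16 + d as i128)
}
```

For k = 15 this gives the digits -1 and 1 (15 = -1 + 1·16). Getting exactly this carry propagation right is the delicate part when the limb count changes.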
It all works (victory)
All asymmetric operations have now been reconfigured to use the upgraded curve. All tests pass.
What this change takes away from the library:
It's worth making a quick note about things we were forced to do to get to this point. One of the nice things about rug is that it can store integers of any bit size, which led to the wide variety of Edwards curves we originally had available to us. In fact, the only reason we were able to replace the coordinate types with fiat-crypto tight field elements was specifically because of the P448 type available. Without this, we would need to use the U448 type in crypto-bigint, which wouldn't be a terrible choice at all, but crypto-bigint also has limited support for the funky bit sizes you'll commonly see in exotic math objects like elliptic curves. Thus, we lose support for the following:
I'm unsure if we can bring them back at some point. Arbitrary precision types with odd bit sizes have an extra layer of difficulty in a homebrew solution, and unless the fiat-crypto authors or someone who contributes to it brings them to life, we are constrained to the types available to us.
All tests passing, merging to main |
This crate used the Rug integer FFI to the GMP library. This gives us blazing fast speeds on integer calculations, but it comes at the cost of not being able to effectively control ownership of the integers when they move out of scope during normal arithmetic operations. Here's an example of some of our problematic code from the Edwards curve add formula:
What's happening?
Because every operation (*, %, etc.) is technically a call to a function that takes its operands by value, the value we send it is moved and ownership has changed. rug's Integer wraps a heap-allocated GMP value, so it is not Copy the way native Rust integer types are, and the values are consumed when we operate on them unless we clone them first. Here's an example:
The problem is that when you are multiplying a curve point by a scalar value that can be 64 bits long in the secure case, these deep copies add up quickly, and frankly seeing so many clones is kind of a bummer and means maybe you aren't getting along with the borrow checker so well.
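By way of contrast, here is a minimal sketch of why a fixed-size, stack-allocated type sidesteps the clone problem: once the type is Copy, the overloaded operators can take operands by value and the originals stay usable. The `Fe` type, the toy prime `P`, and the helper name are hypothetical stand-ins for a real field element, and the sketch assumes all values are already reduced below `P`.

```rust
use std::ops::{Add, Mul};

/// Toy prime modulus (assumption for illustration only).
const P: u64 = 1_000_000_007;

/// A fixed-size field element. Being `Copy`, it is duplicated implicitly
/// and cheaply when passed to an operator, so no `.clone()` is needed.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Fe(u64);

impl Add for Fe {
    type Output = Fe;
    fn add(self, rhs: Fe) -> Fe {
        Fe((self.0 + rhs.0) % P) // inputs are < P, so the sum cannot overflow
    }
}

impl Mul for Fe {
    type Output = Fe;
    fn mul(self, rhs: Fe) -> Fe {
        Fe(((self.0 as u128 * rhs.0 as u128) % P as u128) as u64)
    }
}

/// The (x₁y₂ + y₁x₂) term from the add formula, with no clones in sight.
fn x1y2_plus_y1x2(x1: Fe, y1: Fe, x2: Fe, y2: Fe) -> Fe {
    x1 * y2 + y1 * x2
}
```

After calling `x1y2_plus_y1x2(x1, y1, x2, y2)`, all four inputs can still be used by the caller, which is exactly what the rug version needed a clone for.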
We need an arbitrary precision library that:
Why point 5? Because I have a burning hole in my heart that can only be filled by overloaded operators.
Enter the crypto_bigint and fiat_crypto crates. Points 1 and 2 are addressed trivially. Points 3, 4, and 5 will require investigation, which is the whole point of this ticket. Replacing the rug crate will require significant overhaul of the EC point operations and probably the entire library. But hey, what else is new. This will enhance the library by:
It might be a doozy, but it's well-motivated and worth the effort in my opinion.