Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add point compression + canonical serialization for AffineCurve elements #51

Closed
Pratyush opened this issue Jan 8, 2020 · 14 comments · Fixed by #73
Closed

Add point compression + canonical serialization for AffineCurve elements #51

Pratyush opened this issue Jan 8, 2020 · 14 comments · Fixed by #73

Comments

@Pratyush
Copy link
Member

Pratyush commented Jan 8, 2020

This is one of the critical missing pieces for this library. A sketch of the design I have in mind is to make CanonicalSerialize and CanonicalDeserialize traits:

pub trait CanonicalSerialize {
	type Output: Iterator<Item = u8>
	fn serialize(&self) -> Output;
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(&Self::Output) -> Self;
}

The idea would then be to implement these directly for the final instantiated curves, e.g. for Bls12_381_G1. (I don't see an easy way to have a generic implementation that works for all base fields.)

@Pratyush Pratyush mentioned this issue Jan 8, 2020
18 tasks
@kobigurk
Copy link
Contributor

kobigurk commented Jan 8, 2020

We have an implementation of point compression/decompression for G1/G2 in circuits as well. It requires some new comparison gadgets though, that's why I didn't make a PR yet. Let me see about that!

@Pratyush
Copy link
Member Author

Pratyush commented Jan 8, 2020

Awesome! Would love to see that =)

@xu-cheng
Copy link
Contributor

Just want to ask would it be better to implement serialization and deserialization in serde? Not only it is the rust de facto standard, it also has good support on no_std.

Similarly, the ToBytes and FromBytes could also be replaced with serde implementation to support no_std.

@Pratyush
Copy link
Member Author

I think adding serde support behind a flag would be great. However, I would still like to see our own fine-grained serialization traits because these would be independent of the serialization format. The serde impls would then forward to these traits.

ToBytes and FromBytes are different in that they are intended for cheap in-memory representations, not for serialization for storage or sending across the network. I do intend to change these to support no_std, just trying to figure out the best way to do that...

@xu-cheng
Copy link
Contributor

because these would be independent of the serialization format.

Just want to mention serde is format agnostic.

I do intend to change these to support no_std, just trying to figure out the best way to do that...

Maybe use something like https://docs.rs/scroll/0.10.1/scroll/

@Pratyush
Copy link
Member Author

I think we also need a couple more knobs than are provided by serde (example: generically packing the y-coordinate bit into the x-coordinate during compression), so that's why I'd like our own traits. But I'll admit that I haven't tried to hash out a design in detail yet, so I'm open to brainstorming =)

Maybe use something like docs.rs/scroll/0.10.1/scroll

That looks almost exactly like what I had in mind, thanks!

@xu-cheng
Copy link
Contributor

example: generically packing the y-coordinate bit into the x-coordinate during compression

Just FYI, I don't see why this cannot be done in serde. For example, there is the serde implementation with point compression: https://github.com/filecoin-project/pairing/blob/master/src/bls12_381/serde_impl.rs

@Pratyush
Copy link
Member Author

@Pratyush
Copy link
Member Author

Pratyush commented Jan 21, 2020

Ok expanding on my initial design, I was thinking of creating the following traits:

pub trait CanonicalSerialize {
	type Error;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) -> Result<(), Self::Error>;
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(bytes: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> ;
}

For field elements, this could be implemented as follows:

impl<P: Fp256Parameters> CanonicalSerialize for Fp256<P> {
	type Error = Something;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) {
		let bytes = [0u8; 32];
		self.write(&mut bytes); // From the ToBytes impl
		if extra_info.len() > (256 - P::MODULUS_BITS) {
			return Err(Error::NotEnoughSpace);
		} else {
			// Write bits in `extra_info` to empty space in 
			// last byte of `bytes`.
			// Write `bytes` to `output_buf`.
		}
	}
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(input: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> {
		// 1. Read 32 bytes from `input` into `bytes`.
		// 2. if extra_info_buf.len() > (256 - P::MODULUS_BITS), return Err(Error::NotEnoughSpace);
		// 3. Extract top `extra_info_buf.len()` bits from `bytes`, and then zero out the corresponding bits in the last byte of `bytes`.
		// 4. Invoke `FromBytes::read` on the remaining bytes to get a field element.
	}
}

This impl can be used by AffineCurves via the following impl:

impl<P: SWParameters> CanonicalSerialize for SWAffine<P> {
	type Error = Something;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) {
		// 1. If `self.is_zero()`, output `Self::BaseField::zero().serialize(&[0, 0])`.
		// 2. Else, check if y-coordinate is lexicographically larger.
		//    If yes, then output `self.x.serialize(&[1, 0])`.
		//    If no, then output `self.x.serialize(&[0, 0])`.
	}
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(input: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> {
		// Inverse of above.
	}
}

This serialization scheme requires MODULUS_BITS to be at most 2 less than a multiple of 64. (It might be possible to do with 1 less than multiple of 64, I was looking at the bellman serialization scheme while writing this)

@kobigurk
Copy link
Contributor

Ok expanding on my initial design, I was thinking of creating the following traits:

pub trait CanonicalSerialize {
	type Error;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) -> Result<(), Self::Error>;
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(bytes: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> ;
}

For field elements, this could be implemented as follows:

impl<P: Fp256Parameters> CanonicalSerialize for Fp256<P> {
	type Error = Something;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) {
		let bytes = [0u8; 32];
		self.write(&mut bytes); // From the ToBytes impl
		if extra_info.len() > (256 - P::MODULUS_BITS) {
			return Err(Error::NotEnoughSpace);
		} else {
			// Write bits in `extra_info` to empty space in 
			// last byte of `bytes`.
			// Write `bytes` to `output_buf`.
		}
	}
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(input: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> {
		// 1. Read 32 bytes from `input` into `bytes`.
		// 2. if extra_info_buf.len() > (256 - P::MODULUS_BITS), return Err(Error::NotEnoughSpace);
		// 3. Extract top `extra_info_buf.len()` bits from `bytes`, and then zero out the corresponding bits in the last byte of `bytes`.
		// 4. Invoke `FromBytes::read` on the remaining bytes to get a field element.
	}
}

This impl can be used by AffineCurves via the following impl:

impl<P: SWParameters> CanonicalSerialize for SWAffine<P> {
	type Error = Something;
	fn serialize(&self, extra_info: &[bool], output_buf: &mut [u8]) {
		// 1. If `self.is_zero()`, output `Self::BaseField::zero().serialize(&[0, 0])`.
		// 2. Else, check if y-coordinate is lexicographically larger.
		//    If yes, then output `self.x.serialize(&[1, 0])`.
		//    If no, then output `self.x.serialize(&[0, 0])`.
	}
}

pub trait CanonicalDeserialize: CanonicalSerialize {
	fn deserialize(input: &[u8], extra_info_buf: &mut [bool]) -> Result<Self, Self::Error> {
		// Inverse of above.
	}
}

This serialization scheme requires MODULUS_BITS to be at most 2 less than a multiple of 64. (It might be possible to do with 1 less than multiple of 64, I was looking at the bellman serialization scheme while writing this)

Thanks!
I think bellman uses 2 bits to indicate whether it's a point in infinity (I think it uses even 3, to indicate it's a compressed point).

Some questions:

  • What about making output_buf a writer instead?
  • What's the use of extra_info?

@Pratyush
Copy link
Member Author

Pratyush commented Jan 21, 2020

What about making output_buf a writer instead?

Sure, we could do that, I was trying to avoid introducing no_std incompatibility, which is why I went with the u8-oriented interface.

What's the use of extra_info?

extra_info is used primarily to communicate, in a generic manner, information about the extra bits that need to be serialized from the curve-impl to the field-impl. So for example when serializing a curve point, extra_info would be &[0, 0] if the input was the point at infinity, &[1, 0] if it was not the point at infinity and the y coordinate was the lexicographically larger one, and &[0, 0] (again) otherwise.

@kobigurk
Copy link
Contributor

What about making output_buf a writer instead?
Sure, we could do that, I was trying to avoid introducing no_std incompatibility, which is why I went with the u8-oriented interface.

What's the use of extra_info?
extra_info is used primarily to communicate, in a generic manner, information about the extra bits that need to be serialized from the curve-impl to the field-impl. So for example when serializing a curve point, extra_info would be &[0, 0] if the input was the point at infinity, &[1, 0] if it was not the point at infinity and the y coordinate was the lexicographically larger one, and &[0, 0] (again) otherwise.

Both considerations sound good to me.

@Pratyush
Copy link
Member Author

@kobigurk If you'd like to take a stab at this let me know; otherwise I'll try to push an initial branch over the next few days

@xu-cheng
Copy link
Contributor

Just my two cents. I think that it would be better to eliminate unnecessary copies. In your example above, there are three rounds of copies: (1) copy in self.write(&mut bytes), (2) copy from bytes to output_buf, and (3) copy from output_buf to serde, when serde support is used. This is exactly why I recommend to use serde directly. It supports no_std, has read/write interface, and can eliminate unnecessary data copies. You can read https://serde.rs/custom-serialization.html on how to implement (de)serialization in serde.

My second issue is about extra_info. How should it be passed? Note that usually there is no way to pass additional argument when (de)serializing data. Thinking (de)serializing multiple points in a json file. Also, the way to use it like &[0, 0], &[0, 1] is quite unintuitive. If you really need it, would an enum be better choice?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants