Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crypto_box: Add optional serde support #27

Merged
merged 5 commits into from Jan 12, 2022

Conversation

threema-danilo
Copy link
Contributor

@threema-danilo threema-danilo commented Jan 10, 2022

The serde feature is optional, but enabled by default. (Could be made off-by-default if desired, not sure if we have a batteries-included or a minimal-by-default policy here.)

I used the serde-crate-alias trick, inspired by RustCrypto/RSA#41

For a round-trip serialization test, I added "bincode" and "rand" dev-dependencies.

@tarcieri
Copy link
Member

I'd definitely prefer for serde support to be off-by-default

@threema-danilo threema-danilo changed the title crypto_box: Add serde support to PublicKey crypto_box: Add optional serde support Jan 10, 2022
@threema-danilo
Copy link
Contributor Author

@tarcieri updated!

@tarcieri
Copy link
Member

tarcieri commented Jan 10, 2022

It'd be nice to avoid the serde_derive dependency here, especially as it's writing impls for a single struct.

We've generally used handwritten Deserialize/Serialize impls in other projects but it may be possible to do with declarative attributes.

@threema-danilo
Copy link
Contributor Author

True. Might not be relevant for server applications (where serde_derive will most probably already be part of the dependency tree), but for minimal implementations (e.g. no_std). I'll update the PR.

@tarcieri
Copy link
Member

Here's an example from another crate of ours if this helps: https://github.com/RustCrypto/traits/blob/master/elliptic-curve/src/public_key.rs#L356-L392

@threema-danilo
Copy link
Contributor Author

threema-danilo commented Jan 10, 2022

Updated. I implemented two variants, first I used (ser|de)serialize_newtype_struct, but then I thought that the fact that PublicKey is a newtype probably doesn't matter for serialization, so I switched to the simpler approach of serializing the bytes directly as tuple. If you want, I can also squash the two commits (or you squash them when merging).

Note that my familiarity with serde internals and manual ser/de impls is rather superficial, so I hope this does the right thing 🙂 Round-trip serialization works, and the bincode looks reasonable (32 plain bytes without a length prefix or tag).

@tarcieri
Copy link
Member

tarcieri commented Jan 10, 2022

so I switched to the simpler approach of serializing the bytes directly as tuple.

My understanding is this approach works well for bincode but results in a lot of overhead for e.g. CBOR.

We should probably try to solve these concerns in a single place within the RustCrypto project, but my best understanding is a byte slice is the most flexible option.

@tarcieri
Copy link
Member

Opened an issue about common serde support: RustCrypto/traits#880

@threema-danilo
Copy link
Contributor Author

Alright, I added another commit that switches to a slice based approach. The bincode overhead for this is 8 bytes, plus we need to do length checking (but otherwise that would be done by serde, so that's fine).

With the `serde` feature, serialization and deserialization is
implemented for `PublicKey`.
@tarcieri
Copy link
Member

The CBOR overhead of a tuple is something ridiculous like turning each byte into 4 or 8-bytes, so the unnecessary length prefix in bincode is small by comparison.

@threema-danilo
Copy link
Contributor Author

Here's the error message when deserializing a public key from a 4-byte bincode array:

Custom("invalid length 4, expected a 32-byte public key")

Code:

let deserialized: PublicKey = bincode::deserialize(&[4, 0, 0, 0, 0, 0, 0, 0, 1, 2, 3, 4]).unwrap();

@threema-danilo
Copy link
Contributor Author

The CBOR overhead of a tuple is something ridiculous like turning each byte into 4 or 8-bytes, so the unnecessary length prefix in bincode is small by comparison.

Yeah, serde_json does something similar (I think it's {0: 0xf0, 1: 0x0b, 2: 0xaa, ... }).

@tarcieri
Copy link
Member

One of the things I mentioned on RustCrypto/traits#880 is the ability to generate serializers/deserializers which use a different representation for human readable formats, e.g. hex.

You can see some examples of that here: https://github.com/RustCrypto/traits/blob/559eb9e349884ff53db032db9bff7044b9d73dad/elliptic-curve/src/scalar/core.rs#L405-L464

@threema-danilo
Copy link
Contributor Author

Ah, yeah, I saw you mentioned the human readable format thing. I wasn't aware that it's that easy to implement, I'll add this tomorrow 🙂

@threema-danilo
Copy link
Contributor Author

Actually, it's not as easy, because currently there's no method of hex encoding/decoding a slice in the library. And I'm not sure if it should be the responsibility of this library (as a user, you can access the bytes and encode it yourself).

In the case of ScalarCore (linked above) it's only a single uint which is serialized using the Display impl. That's easy. If we want to do this efficiently for a 32-byte slice, then we'd probably need to bring in a library like data_encoding or base16.

If we do that, then a few questions arise:

  • Is there a preferred library in the RustCrypto ecosystem for doing this?
  • Should that library be optional? If yes, that would actually cause compatibility issues, because if crypto_box is compiled with that feature flag on one computer, and without it on another computer, then a public key serialized with a human readable serializer cannot be deserialized on the other computer without adding some content format detection hacks.

I'd either go with "standardize on a non-optional hex encoding dependency" or "don't do any specialization for human readable serializers at all".

(By the way, the ScalarCore serialization has a similar compatibility issue, a value serialized with a human-readable serializer on a system with alloc enabled cannot be deserialized on another system without alloc.)

@tarcieri
Copy link
Member

tarcieri commented Jan 11, 2022

Is there a preferred library in the RustCrypto ecosystem for doing this?

Really before we proliferate this too much, I'd prefer to make a hexct crate similar to the base64ct so we could potentially handle secret keys with it. And ideally, I think it'd be nice to get a single solution for Display/FromStr impls in addition to serde, solved consistently in a common place. So this all goes back to RustCrypto/traits#880.

Should that library be optional? If yes, that would actually cause compatibility issues, because if crypto_box is compiled with that feature flag on one computer, and without it on another computer, then a public key serialized with a human readable serializer cannot be deserialized on the other computer without adding some content format detection hacks.

I think it would make sense for it to always be included when serde support is enabled.

(By the way, the ScalarCore serialization has a similar compatibility issue, a value serialized with a human-readable serializer on a system with alloc enabled cannot be deserialized on another system without alloc.)

That's just a deficiency of the current implementation. It could be corrected. The serializer is based on Display, so it's using that impl to thunk to serde via ToString.

All that said, my main suggestion would be to just stick with binary serializers for now and circle back on human readable format detection in some common solution like RustCrypto/traits#880.

@newpavlov
Copy link
Member

Really before we proliferate this too much, I'd prefer to make a hexct crate similar to the base64ct so we could potentially handle secret keys with it.

Is it needed though? I think all hex encoder/decoder implementations use code based on simple subtraction/addition/shifts without branches on processed data.

@tarcieri
Copy link
Member

tarcieri commented Jan 11, 2022

Nope! The hex crate uses a LUT: https://github.com/KokaKiwi/rust-hex/blob/e8b40e42/src/lib.rs#L88

...and heavy branching: https://github.com/KokaKiwi/rust-hex/blob/e8b40e42/src/lib.rs#L176-L184

@threema-danilo
Copy link
Contributor Author

Yeah, for secret keys you'd need a constant-time implementation. It probably makes sense to not let a LUT-based hex encoding crate into the dependency list at all (even if it's currently only used for public keys).

All that said, my main suggestion would be to just stick with binary serializers for now and circle back on human readable format detection in some common solution like RustCrypto/traits#880.

I agree. In that case, this PR can be merged, right?

@threema-danilo
Copy link
Contributor Author

Hm, don't merge this yet, it seems that msgpack roundtrip-serialization with rmp-serde seems to fail... I first need to find out why.

---- tests::test_public_key_serialization stdout ----
thread 'tests::test_public_key_serialization' panicked at 'Public key could not be deserialized: Syntax("invalid type: sequence, expected a borrowed byte array")', crypto_box/src/lib.rs:490:48

@threema-danilo
Copy link
Contributor Author

I think I found the reason. I initially implemented deserialization like this.

// Deserialize slice
let slice = <&[u8]>::deserialize(deserializer)?;

// Convert to array (with length check)
let array: [u8; KEY_SIZE] = slice
    .try_into()
    .map_err(|_| Error::invalid_length(slice.len(), &"a 32-byte public key"))?;

Ok(PublicKey::from(array))

Since we deserialize into a slice, that means that this will only work for deserializers that can borrow the serialized data directly. That's the case for bincode, and I think for CBOR as well. The msgpack deserializer doesn't seem to support this.

I changed the deserializer to use a SeqAccess visitor. This way the serialized data is processed once, and then copied into a newly allocated byte array, which is then wrapped in a PublicKey.

I pushed a commit with the rmp roundtrip test as well, but I can also remove that commit from the PR if you prefer fewer dev dependencies.

This allows serializers to use a more efficient byte array
representation. Otherwise sequence based serialization would be used.
@tarcieri
Copy link
Member

tarcieri commented Jan 11, 2022

That's unfortunate. I wonder if there's an actual limitation of the msgpack format that requires that, or if it's just a missing feature of the decoder.

(Also: issues like this are all the more reason why it'd be good to solve serialization centrally in one place)

@tarcieri tarcieri merged commit ddc8624 into RustCrypto:master Jan 12, 2022
@threema-danilo threema-danilo deleted the serde branch January 12, 2022 07:57
@tarcieri tarcieri mentioned this pull request Jan 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants