Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement serde Serialize and Deserialize for relevant types #192

Closed
vlmutolo opened this issue Mar 27, 2021 · 7 comments · Fixed by #193
Closed

Implement serde Serialize and Deserialize for relevant types #192

vlmutolo opened this issue Mar 27, 2021 · 7 comments · Fixed by #193
Assignees
Labels
new feature New feature or request
Milestone

Comments

@vlmutolo
Copy link
Contributor

vlmutolo commented Mar 27, 2021

Summary

The serde::Serialize and serde::Deserialize traits are used pervasively throughout the Rust ecosystem to enable flexible (de)serialization of data. Many authors derive these types on structs using macros provided by serde. If orion's types don't support these traits, users won't be able to derive these serialization traits, and will have to resort to converting orion types before storing them in a struct for serialization.

By implementing serde's standard traits, we enable authors to skip the mental burden and increased complexity of serializing orion types without serde.

Scope

At least for a first implementation, I would suggest only implementing serialization traits for types that are expected to either be stored in a database or sent to another application (e.g. over the network). This would include the following types.

  • pwhash::PasswordHash
  • hash::Digest
  • auth::Tag
  • kdf::Salt (added after suggestion below)

Motivating Example

My motivating example is storing a User type in a database. It would be nice to simply define a User struct, and then have it serialize to one long string using, for example, the bincode crate. This is what I would like the basic code to look like (using a fictional database API).

use serde::{Serialize, Deserialize};
use orion::pwhash::PasswordHash;

#[derive(Serialize, Deserialize)]
struct User {
    username: String,
    pwhash: PasswordHash,
    last_login: chrono::DateTime<Utc>,
}

fn store_user_in_db(user: &User, db: &Db) {
    db.write_bytes(&user.username, bincode::serialize(user));
}

fn retrieve_user_from_db(db: &Db) -> User {
    let bytes = db.get_bytes(&user.username);
    bincode::deserialize(bytes)
}

The PasswordHash type would serialize as a string by delegating to PasswordHash::unprotected_as_encoded.

Without serde's types implemented, however, we would have to do one of the following.

  1. Create a separate User type that can be serialized.
  2. Implement serde manually on User and delegate the PasswordHash serialization to its unprotected_as_encoded method. Neither solution is particularly attractive.

A separate, serializable User struct is probably the easiest of the the alternatives. It would look like this.

// previous definitions omitted

#[derive(Serialize, Deserialize)]
struct UserSerialize {
    username: String,
    pwhash: String,
    last_login: chrono::DateTime<Utc>,
}

fn store_user_in_db(user: &User {username, pwhash, last_login}, db: &Db) {
    let user_serialize = UserSerialize {username, last_login, pwhash: pwhash.unprotected_as_encoded()};
    db.write_bytes(&user.username, bincode::serialize(to_serialize));
}

fn retrieve_user_from_db(db: &Db) -> User {
    let bytes = db.get_bytes(&user.username);
    let UserSerialize {username, pwhash, last_login} = bincode::deserialize(bytes);
    User {username, last_login, pwhash: PasswordHash::from_encoded(pwhash)}
}

It isn't terrible in terms of boilerplate, but it could start to get annoying and possibly error-prone to repeat struct definitions for several different types instead of just User.

Implementing Serialize and Deserialize manually can be… unpleasant. Especially for anything complex, and definitely more boilerplate in any case.

Pros of implementing Serialize/Deserialize

  • There is significant ergonomic benefit to supporting this nearly ubiquitous pair of serialization traits.
  • Users may be less likely to mistakenly reach for non-constant-time operations if they never actually (de)serialize the type themselves and just let serde do it for them. The common workflow is to deserialize some bytes or a string into the relevant type (would be User here, for example), and then to do operations on the fully featured type. This is beneficial because many of orion's types use constant-time operations for things like comparisons. It's anecdotal, but in my first example, the PasswordHash type spends very little time in its serialized state because it's so easy to deserialize into the fully featured type. Making deserialization the fast path could save users from security mistakes.

Cons

  • The serialization functions in orion are helpfully named unprotected_* to signify that by using them, authors are relinquishing the constant-time features that orion provides. If we hide these functions in a serde::Serialize implementation, users may be more likely to mistakenly serialize a sensitive type and perform insecure operations on it. I realize that this somewhat contradicts the last "Pro" above, but they both seem like legitimate possibilities to consider.

Additional Considerations

  • Serialization should conform to constant-time operations already in place. For example, PasswordHash should serialize as a string using the already-implemented PasswordHash::unprotected_as_encoded function.
  • We may have to add additional mechanisms for testing to ensure that all tests pass with or without the hypothetical serde feature enabled.
@vlmutolo vlmutolo added the new feature New feature or request label Mar 27, 2021
@vlmutolo vlmutolo self-assigned this Mar 27, 2021
@brycx
Copy link
Member

brycx commented Mar 27, 2021

Fantastic feature request, as always.

I agree this change seems to make a lot of sense. There are much more benefits to this, especially usability, than I see cons for this. The only other con I see, is that it complicates the crate's feature-offering. This is unavoidable however and a sacrifice worth here.

If we hide these functions in a serde::Serialize implementation, users may be more likely to mistakenly serialize a sensitive type and perform insecure operations on it

If it's hidden behind a non-default feature and we document the risk, I believe this con is greatly outweighed by the pros.

The scope should also include orion::kdf::Salt. This is stored in some cases, such as file encryption where we'd save the salt used for key-derivation based on a password.

Some additional notes to this (not necessarily in scope of this, probably separate issues too):

  • Adding a feature is a breaking change
  • With this additional feature, I think it makes sense to add a dedicated "Features" section to the wiki. Explaining all features, how to enable them and which feature-combinations are expected to be made use of (if any combinations at all)
  • The serialization and de-serialization logic seems to be an obvious candidate for fuzzing as well

@vlmutolo
Copy link
Contributor Author

The only other con I see, is that it complicates the crate's feature-offering. This is unavoidable however and a sacrifice worth here.

I agree, especially since it seems pretty common for Rust crates to feature-gate serde support. The compile times are pretty rough to make it default.

The scope should also include orion::kdf::Salt. This is stored in some cases, such as file encryption where we'd save the salt used for key-derivation based on a password.

Good catch. I missed this one.

Adding a feature is a breaking change

Are you saying this because we’re pre-1.0 or because it’s something like a security policy for orion? Normally I think a feature would be a minor release under semver.

I think it makes sense to add a dedicated "Features" section to the wiki. Explaining all features, how to enable them and which feature-combinations are expected to be made use of (if any combinations at all)

I strongly agree. I’ve been frustrated too many times trying to look through source code to figure out exactly what a feature does.

The serialization and de-serialization logic seems to be an obvious candidate for fuzzing as well

Agreed. I think we should fuzz something like bincode::serialize itself. That way, if we ever change the internals of how orion’s types serialize, the process is still covered. This is as opposed to only fuzzing the unprotected_* and from_encoded functions, for example.

@brycx
Copy link
Member

brycx commented Mar 27, 2021

Are you saying this because we’re pre-1.0 or because it’s something like a security policy for orion? Normally I think a feature would be a minor release under semver.

I think I've maybe been mistaken on this part actually. I simply assumed it was breaking because if compiled with all-features then, it might break compatibility if a different serde, non-compatible, version is already in the dependency tree. I know of one case where upgrading getrandom to 0.2 in orion broke a build for someone using orion. Seems this has been a common problem elsewhere though, and this isn't covered by SemVer.

That said, when the alloc feature was added, this was treated as a non-breaking change. Granted, that did not add an extra dependency. According to the API evolution RFC-1105, it is considered a minor change though. Based on this, I think you're right this is a minor change.

@brycx
Copy link
Member

brycx commented Mar 27, 2021

I think we should fuzz something like bincode::serialize itself. That way, if we ever change the internals of how orion’s types serialize, the process is still covered. This is as opposed to only fuzzing the unprotected_* and from_encoded functions, for example.

Hereby you mean fuzz the serialization implementations through bincode, correct?

@vlmutolo
Copy link
Contributor Author

Hereby you mean fuzz the serialization implementations through bincode, correct?

Yes, that's what I mean. We need something that implements serde::ser::Serializer to test the serialization routines implemented in orion. We already rely on serde_json as a dev-dependency, so I would suggest sticking with that.

@brycx
Copy link
Member

brycx commented Oct 17, 2021

Re-opened since there's still the fuzzers missing.

Additionally, in extension of the issue-description, mentioning which types implement provide serde support, more types actually have support for this now. This is because it was implemented in macros like construct_public!, so it'll also include the Nonce for example. This is just a note for the future changelog entry.

@brycx
Copy link
Member

brycx commented Oct 31, 2021

Fuzzing targets have been added in orion-rs/orion-fuzz#18

@brycx brycx closed this as completed Oct 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants