Efficiency of bytes encoding #360

Closed
elliottslaughter opened this issue Sep 17, 2017 · 5 comments

@elliottslaughter

Is serde_json integrated with serde_bytes? I was hoping that using serde_bytes would allow me to encode Vec<u8> somewhat intelligently (e.g. as a base64-encoded string, or similar). However, a very simple test seems to indicate that it doesn't do anything at all:

plain {"c":[123,124,125]}
bytes {"c":[123,124,125]}

This encoding results in an expansion ratio of up to 4:1 (depending on the values of the bytes encoded), rather than the 4:3 of a base64 encoding. Needless to say, in some use cases switching to base64 would be a really significant reduction in message size.

The source of the test is below.

#[macro_use]
extern crate serde_derive;
extern crate serde_bytes;
extern crate serde_json;

#[derive(Serialize, Deserialize, Debug)]
pub struct MyPlain {
    c: Vec<u8>
}

#[derive(Serialize, Deserialize, Debug)]
pub struct MyBytes {
    #[serde(with = "serde_bytes")]
    c: Vec<u8>
}

fn main() {
    let plain = MyPlain { c: vec![123, 124, 125] };
    let plain_json = serde_json::to_string(&plain).unwrap();
    println!("plain {}", plain_json);

    let bytes = MyBytes { c: vec![123, 124, 125] };
    let bytes_json = serde_json::to_string(&bytes).unwrap();
    println!("bytes {}", bytes_json);
}
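
The expansion figures quoted above can be checked with a little arithmetic. The sketch below uses only the standard library; the function names `json_array_len` and `base64_len` are illustrative and not part of serde_json or any other crate. It assumes compact JSON output (no whitespace) and padded standard base64.

```rust
/// Length in bytes of a compact JSON array of u8 values, e.g. `[123,124,125]`.
pub fn json_array_len(bytes: &[u8]) -> usize {
    if bytes.is_empty() {
        return 2; // just "[]"
    }
    // Each byte is printed as 1 to 3 decimal digits.
    let digits: usize = bytes
        .iter()
        .map(|&b| match b {
            0..=9 => 1,
            10..=99 => 2,
            _ => 3,
        })
        .sum();
    // Digits, plus one comma between neighbours, plus the two brackets.
    digits + (bytes.len() - 1) + 2
}

/// Number of characters in padded base64 for `n` input bytes
/// (the surrounding JSON quotes would add 2 more).
pub fn base64_len(n: usize) -> usize {
    4 * ((n + 2) / 3)
}
```

For 3000 bytes that are all 255, the array form takes 12001 characters (about 4:1) while base64 takes 4000 (exactly 4:3). Arrays of small values fare better, which is why the ratio is "up to" 4:1.
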
@dtolnay
Member

dtolnay commented Sep 17, 2017

The default encoding is not going to change but you can plug in base64 if you want.

#[derive(Serialize, Deserialize, Debug)]
struct MyBytes {
    #[serde(with = "base64")]
    c: Vec<u8>
}

mod base64 {
    extern crate base64;
    use serde::{Serializer, de, Deserialize, Deserializer};

    pub fn serialize<S>(bytes: &[u8], serializer: S) -> Result<S::Ok, S::Error>
        where S: Serializer
    {
        serializer.serialize_str(&base64::encode(bytes))

        // Could also use a wrapper type with a Display implementation to avoid
        // allocating the String.
        //
        // serializer.collect_str(&Base64(bytes))
    }

    pub fn deserialize<'de, D>(deserializer: D) -> Result<Vec<u8>, D::Error>
        where D: Deserializer<'de>
    {
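        // Deserialize into an owned String. A borrowed &str fails when the
        // input cannot be borrowed from directly, e.g. with
        // serde_json::from_reader or a JSON string containing escapes.
        let s = String::deserialize(deserializer)?;
        base64::decode(&s).map_err(de::Error::custom)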
    }
}

I filed marshallpierce/rust-base64#46 to consider providing these functions directly in the base64 crate.

@elliottslaughter
Author

That seems to work, thanks.

@dl00

dl00 commented Dec 23, 2017

What happens if we have two serialization libraries (for example serde_json and bincode) and we want to encode as base64 in JSON, but not in bincode?

@clarfonthey

@dl00 that's what Serializer::is_human_readable is for

@anguslees

To save others from searching, the "Display" version of the above is:

pub fn serialize<S>(bytes: &[u8], serializer: S) -> Result<S::Ok, S::Error>
    where S: Serializer
{
    serializer.collect_str(&base64::display::Base64Display::standard(bytes))
}
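
If the installed base64 version lacks that display helper, the "wrapper type with a Display implementation" mentioned earlier in the thread can also be written by hand. The following is only a sketch using the standard library: the `Base64` struct here is a stand-in defined locally, not a type from the base64 crate, and it assumes the standard alphabet with `=` padding.

```rust
use std::fmt;
use std::fmt::Write;

const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/// Borrowing wrapper that formats its bytes as padded standard base64
/// without building an intermediate String.
pub struct Base64<'a>(pub &'a [u8]);

impl<'a> fmt::Display for Base64<'a> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        for chunk in self.0.chunks(3) {
            // Pack up to three bytes into a 24-bit group, zero-filling the tail.
            let b0 = chunk[0] as u32;
            let b1 = chunk.get(1).copied().unwrap_or(0) as u32;
            let b2 = chunk.get(2).copied().unwrap_or(0) as u32;
            let n = (b0 << 16) | (b1 << 8) | b2;

            // Split into four 6-bit indices; pad where input bytes are missing.
            let out = [
                ALPHABET[((n >> 18) & 63) as usize],
                ALPHABET[((n >> 12) & 63) as usize],
                if chunk.len() > 1 { ALPHABET[((n >> 6) & 63) as usize] } else { b'=' },
                if chunk.len() > 2 { ALPHABET[(n & 63) as usize] } else { b'=' },
            ];
            for &c in out.iter() {
                f.write_char(c as char)?;
            }
        }
        Ok(())
    }
}
```

Inside a `serialize` function this would be used as `serializer.collect_str(&Base64(bytes))`, which lets the serializer write the characters as they are produced.
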
