Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serde support #46

Closed
dtolnay opened this issue Sep 17, 2017 · 14 comments
Closed

Serde support #46

dtolnay opened this issue Sep 17, 2017 · 14 comments

Comments

@dtolnay
Copy link

dtolnay commented Sep 17, 2017

If you add serialize and deserialize functions to this crate, Serde will be able to use it to easily serialize fields to base64 and deserialize from base64.

extern crate base64;

#[derive(Serialize, Deserialize)]
struct MyBytes {
    #[serde(with = "base64")]
    c: Vec<u8>,
}

There is one possible implementation in serde-rs/json#360.

I would expect this to be behind a Cargo feature, something like:

base64 = { version = "0.6", features = ["serde"] }
@marshallpierce
Copy link
Owner

Unless I'm missing something, it doesn't seem like that approach would allow the user to specify the config to use when serializing.

@dtolnay
Copy link
Author

dtolnay commented Sep 19, 2017

Just the base64::STANDARD config would be a good start. Is that the same encoding Go does by default for bytes in JSON?

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	type E struct {
		Bytes []byte
	}
	j, err := json.Marshal(E { Bytes: []byte("data0123456789") })
	if err != nil {
		panic(err)
	}
	fmt.Println(string(j))
}

A configurable variation may involve serialize and deserialize helpers on Config.

static MY_CUSTOM = base64::Config {
    char_set: /* ... */,
    pad: /* ... */,
    strip_whitespace: /* ... */,
    line_wrap: /* ... */,
};

#[derive(Serialize, Deserialize)]
struct MyBytes {
    #[serde(with = "base64::URL_SAFE")]
    a: Vec<u8>,
    #[serde(with = "MY_CUSTOM")]
    b: Vec<u8>,
}

@marshallpierce
Copy link
Owner

Hmm... It seems like this could be solved without any magic strings representing the different configs, etc, by simply providing a snippet of sample code that people could modify for their own purposes akin to what was in the json example you linked to. I'd like to stay away from baking in magic strings or any other sort of pseudo-API, or embedding the standard config as the only realistic choice. (For instance, for most users, what they probably want is standard with no padding since it saves a byte or two.)

In general, I think that serialization decisions should be made by the application, not by libraries they depend on. I would rather that applications make explicit (but not overly burdensome) choices for how they want to serialize, rather than just going with what their particular framework spoon-feeds them. (That's how you end up with reimplementations of PHP serialization in other languages to interop with crappy web services...) For possibly related context on where I'm coming from on this, I had a discussion about the merits of the serde semi-magical approach on another project.

@dtolnay
Copy link
Author

dtolnay commented Sep 20, 2017

Thanks! I haven't had a chance to read the thread you linked but I am unclear what it is we disagree on, if anything.

I'd like to stay away from baking in magic strings or any other sort of pseudo-API

When you say magic strings, is that referring to with = "base64::URL_SAFE"? That is just the path to base64::URL_SAFE. I guess I don't normally conceptualize named variables / modules / statics as magic strings -- not any more than the word "base64" is a magic string when I write extern crate base64, for example.

I would rather that applications make explicit (but not overly burdensome) choices for how they want to serialize

Okay! I think this is exactly what MY_CUSTOM is for in my previous comment.

@marshallpierce
Copy link
Owner

I see what you mean -- more my lack of real world use of serde showing than any real disagreement, I think. The stringy nature is just the way we have to expose the serializer to the code gen machinery, then? Regrettable, but at least it's still checked at compile time.

It seems like a conflation of concerns to bolt this onto Config, though. A config is just a config, so it would make more sense to me to have a separate struct to handle the serde integration that has a Config member. However, I'm not really clear on what's going on with the with customization. What is the thing that gets pointed to supposed to be? I saw in your json example it's a module, but I clicked around a little and didn't see where that's documented.

@marshallpierce
Copy link
Owner

OK, found https://serde.rs/field-attrs.html. I'm not sure I'm understanding this correctly, though, since it looks like with has to be a module, but in your example above you're using a Config. Can you go into more detail about the extent of what you can do with with?

@dtolnay
Copy link
Author

dtolnay commented Sep 20, 2017

It doesn't matter whether with is a module or crate or const or type or whatever else. We take whatever "thing" you write and invoke thing::serialize to serialize the field and thing::deserialize to deserialize the field, as if those functions were the implementation of Serialize and Deserialize for the field's type. I think using a base64 config to configure how a field is represented in base64 is concise, expressive, sufficiently explicit, and intuitive.

@marshallpierce
Copy link
Owner

Hmm... still looks like a violation of SOLID to me. I don't like having to take a route that the consumer of the library couldn't feasibly replace; that's a design smell.

Anyway, be that as it may, can you comment on https://github.com/alicemaz/rust-base64/compare/serde?expand=1? I think I'm doing more or less the same thing as your CUSTOM approach above, but I get compiler woes:

error[E0433]: failed to resolve. Use of undeclared type or module `B64`
  --> src/serde.rs:54:14
   |
54 |     #[derive(Serialize, Deserialize)]
   |              ^^^^^^^^^ Use of undeclared type or module `B64`

@dtolnay
Copy link
Author

dtolnay commented Sep 21, 2017

Oh I see, the generated code blindly invokes B64::serialize(/* ... */) instead of what we would want which is B64.serialize(/* ... */). I filed serde-rs/serde#1059 to follow up.

In the meantime, top-level base64::serialize and base64::deserialize functions would be helpful. Those would be usable concisely as #[serde(with = "base64")] and assume a default configuration, like how base64::encode and base64::decode assume the STANDARD configuration. And when necessary, the user can continue to handle custom configurations the way we did in serde-rs/json#360.

@marshallpierce
Copy link
Owner

Sorry to be so contrary, but I don't want to add an awkward interim solution that I would then have to deprecate later (or maintain forever). It also creates a default choice that isn't necessarily suitable for everyone. For instance, I suspect that many users will in fact want STANDARD_NO_PAD, not STANDARD to save 1 byte on average. (Padding is entirely unnecessary; un-padded base64 can be decoded just fine, and having padding just creates an opportunity to have done it wrong.)

However, I don't want to leave serde users high and dry, so I slapped together this band-aid to ease creating the type wrapper around a config: https://github.com/marshallpierce/base64-serde/blob/master/src/lib.rs. I think this should be a good-enough solution until serde-rs/serde#1059 is solved. (BTW, I'm a macro noob, so happy to take feedback on my macro, etc.) Anyway, if that looks reasonable, I'll go ahead and write some docs and publish that. I made it a separate project so that it can be more conveniently abandoned once we have a better approach available.

@dtolnay
Copy link
Author

dtolnay commented Sep 21, 2017

Base64-serde looks good to me. I filed some issues but nicely done. Thanks for putting that together!

49% of crates that depend on base64 also depend on serde so this will be a big help.

It also creates a default choice that isn't necessarily suitable for everyone.

Hmm, how would this be worse than base64::encode and base64::decode choosing a default behavior?

I suspect that many users will in fact want STANDARD_NO_PAD, not STANDARD to save 1 byte on average.

I think if STANDARD with padding is good enough for Go to use for all byte slices by default, it's good enough for us. It seems like a safe default that has probably better interoperability with remote systems than STANDARD_NO_PAD.

@marshallpierce
Copy link
Owner

Thanks for the feedback on base64-serde; I'll work on what you've suggested.

Hmm, how would this be worse than base64::encode and base64::decode choosing a default behavior?

It's not, but I didn't write that API, and I wouldn't have chosen to make it available if I was involved in the crate then. :)

@marshallpierce
Copy link
Owner

https://crates.io/crates/base64-serde is now a thing. Anything else to do for this issue?

@dtolnay
Copy link
Author

dtolnay commented Sep 22, 2017

LGTM, thank you!

@dtolnay dtolnay closed this as completed Sep 22, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants