Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CBOR support #163

Closed
pyfisch opened this issue Sep 28, 2015 · 9 comments
Closed

CBOR support #163

pyfisch opened this issue Sep 28, 2015 · 9 comments
Labels

Comments

@pyfisch
Copy link
Contributor

pyfisch commented Sep 28, 2015

I have written a CBOR serialization and deserialization crate for serde. The conforms to the CBOR spec, is tested and documented. Is it possible to move it to the serde-rs organization? I would also really like to hear some code review.

Serde supports annotations to values to assign a special meaning. For example the tag 32 is used to annotate URIs. To support these tags within serde a method for serde::de::Visitor to for tags would be useful. The signature could be something like fn visit_tag<D>(&mut self, tag: u64, _deserializer: &mut D) -> Result<Self::Value, D::Error> and default to ignoring the tag and just visit the inner value. There could be a similiar method for ser::Serializer.

@erickt
Copy link
Member

erickt commented Sep 28, 2015

@pyfisch: I'd be fine if you want to have CBOR be a part of serde-rs. Have you and @BurntSushi figured out if you can consolidate your projects?

On tags, I was struggling as well to figure out a good way to express them when I ported https://github.com/BurntSushi/rust-cbor over to serde. I did a terrible hack where I inspected the struct name and did something special with that, but I wasn't particularly happy with that approach. Having a visit_tag is interesting, but I do worry a little that it would be specific just to CBOR. Would you need an equivalent construct on the Serializer?

@pyfisch
Copy link
Contributor Author

pyfisch commented Sep 29, 2015

I already compared my code to the implementation by @BurntSushi. The code is very similiar but rust-cbor uses an intermediate syntax while serde-cbor directly deserializes the values. We did not yet discuss if we can cosolidate projects.

I would need visit_tag both for deserializer and serializer, but it would probably easier to start with just deserialization. Tags are not only used in CBOR, also BSON has tags for binary data with values ranging from 0 to 255. Also messagepack has extension types ranging from -128 to 127 while the negative codes are reserved. There are probably some more formats using a similiar extension mechanism. If the visit_tag method would take a string describing the format like "cbor", or "bson", and supported different numeric and string types as tags it could be generic over many formats.

@BurntSushi
Copy link
Contributor

I have clearly dropped the ball on using serde (in all my crates, not just cbor). The worst part is that I don't see myself having cycles to dive into it in the immediate future. Therefore, I shouldn't stand in the way of progress.

I have two suggestions:

  1. Let my implementation of cbor languish as "the old one for rustc-serialize" and move to @pyfisch's implementation for the future (which is presumably serde). I personally will probably continue to use rustc-serialize for the time being because it is more ergonomic on stable Rust. Once the world moves to serde and my crate no longer has a use, I'll happily transfer the crate name cbor to whoever wants it (so long as there is a major version bump).
  2. I'd be very weary of adding special "tag" support to the serde traits. I think at least one important question to answer is whether specialization will solve the problem to some degree. If it's conceivable that it could, it might be worth holding off. I will say though that some way to achieve tags is really important. It's a vital feature of CBOR in my experience.

@pyfisch
Copy link
Contributor Author

pyfisch commented Oct 7, 2015

  1. This is ok for me. But I don't think that my crate will need the name cbor.
  2. Can you explain how specialization may solve the problem?

@BurntSushi
Copy link
Contributor

@pyfisch RE 2: No, I can't. :-( The only thing I can say is that when I was writing cbor, I thought to myself, "Hmm, specialization might be useful for tags." But I can't remember anything else, and that may have been specific to rustc-serialize's design.

@erickt
Copy link
Member

erickt commented Oct 19, 2015

@pyfisch / @BurntSushi: Do you think it would ever make sense to factor out the cbor parser and serializer from serde and rustc-serialize, so it could be shared between the two, or whatever future mechanism came along? One that could model these tagged values better than something serde could do?

@BurntSushi
Copy link
Contributor

@erickt If you look at my cbor crate, there is some code sharing, but I mostly failed to reuse code between parsing CBOR into an algebraic data type and decoding directly in rustc-serialize.

@Parakleta
Copy link

I'm not sure exactly how all of this code is structured but I'd like to point out that I think on the decoding side the tag has to be able to change (or possibly stack) the visitor used downstream of itself, so perhaps there needs to be a map somewhere between tag values and visitors.

In some sense this can be considered similar to differentiating maps from sequences, since a map is really just a special interpretation of a sequence (even number of elements, alternating keys and values).

As a complicated example of a tag consider http://cbor.schmorp.de/stringref. In this case tag 256 declares that every string (text or byte) contained within the tag will be indexed and any tag 25 will contain exactly an integer which is an index into the string table. This means that when the tag 256 is seen then the visitor needs to be wrapped with one that can inspect every other string decoding so that it can build its table. If another tag 256 is encountered then the previous one needs to be occluded until the inner one is finished and then the outer one is restored.

@dtolnay
Copy link
Member

dtolnay commented Jul 31, 2016

The tagging discussion is continued in #455. It looks like the solution will be specialization as @BurntSushi guessed.

@dtolnay dtolnay closed this as completed Jul 31, 2016
rubdos pushed a commit to rubdos/serde that referenced this issue Jun 20, 2017
Update num_macros to current nightly

Fixes num_macros build for current nightly, broken by [this](https://github.com/rust-lang/rust/pull/31487/files) PR on rust.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

5 participants