Implement semantic tagging of values #157
Comments
It needs a modified Serde: https://github.com/vmx/serde/tree/tagged-value-deserialization Part of pyfisch#157.
Here's my version of the deserializer:
I'm not happy with the result. One missing piece is deserializing the value (without the tag) by default. I just couldn't get it working; I always got trait bound errors. I hope someone can help with that. I also don't really like the
@vmx Thanks for trying. Unfortunately, I still have not received any feedback on the serde PR. I fear it might stall and go nowhere. Therefore I will only work on the deserialization part once it has received some positive comments from the serde maintainers.
@dignifiedquire, @vmx, @rklaehn (@Actyx), @mcamou, @bantonsson, @Alexander89 and all others interested in support for CBOR tagged values: I would appreciate your input on which of the options outlined below you would prefer, and any other suggestions you have.
Two weeks have passed since serde-rs/serde#1643 was closed. I think it is unlikely that serde will incorporate support for tags any time soon, or even at all. Now there are different ways forward:
I think that is, generally speaking, a good idea. The YAML and MessagePack implementations for Serde both use a general parsing library and have a Serde layer on top. If someone needs tags and doesn't necessarily need Serde, they could still use the same well-tested, established code without needing to look into alternatives.
Does
I think opt-in for tags (with or without a fork) is the way to go. That way, other library authors who need tags will hopefully support that version, and it will show how useful the feature is. This will at least keep my hope up that Serde will eventually support such a feature in the distant future.
Please don't hit me for the suggestion, but essentially the problem is that there is a missing communication channel between the format and the serializer to allow tags to work. Would it help to just put the damn tag into a thread-local so that a special visitor can get the tag when it needs to know how to interpret the data? As a stopgap until we can convince dtolnay to help formats that don't exactly fit the serde data model a tiny little bit...
Not 100% sure what the consequences are, but it does seem like thread-locals are pretty cheap in Rust when compiled with optimizations: https://godbolt.org/z/MZDnX4 So if users are made aware of the downside (it reserves an additional 16 bytes for every thread, or for every thread that has ever been used for CBOR stuff, not sure how exactly it works), then this might work.
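For readers who want to see the shape of the thread-local idea, here is a minimal sketch using only the standard library. It is not the code from any linked branch: `CURRENT_TAG`, `with_tag`, `current_tag` and the toy `visit_str` are hypothetical names, and a real integration would call them from inside the format's deserializer and a serde `Visitor`.

```rust
use std::cell::Cell;

thread_local! {
    // Hypothetical side channel: the tag of the data item currently being decoded.
    static CURRENT_TAG: Cell<Option<u64>> = Cell::new(None);
}

/// Would be called by the format code right before it hands a value to a
/// visitor. Runs `f` with the tag set, then restores the previous value.
/// (Assumption: `f` does not panic; a real version would use a drop guard.)
fn with_tag<T>(tag: Option<u64>, f: impl FnOnce() -> T) -> T {
    let previous = CURRENT_TAG.with(|c| c.replace(tag));
    let result = f();
    CURRENT_TAG.with(|c| c.set(previous));
    result
}

/// Would be called by a tag-aware visitor to learn which tag preceded its value.
fn current_tag() -> Option<u64> {
    CURRENT_TAG.with(|c| c.get())
}

/// Toy stand-in for a visitor method: interprets a string depending on the tag.
fn visit_str(s: &str) -> String {
    match current_tag() {
        Some(32) => format!("url:{}", s),
        _ => format!("text:{}", s),
    }
}
```

Restoring the previous value keeps nested tagged items from leaking their tag into sibling values, which seems like the main correctness risk of a side channel like this.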
OK, I just could not resist trying this out (using a thread-local to get the info over the fence): rklaehn#2
Not sure what the implications of option 4, forking serde, are. Serde is as close to a standard library as it gets, and lots of crates offer serde support behind a feature flag. That would all no longer work with a forked serde, right? (Not entirely sure how that works, as I am rather new to Rust.)
I just want to mention that I've put more thought into my use case. I'll probably go with something that doesn't need Serde support. So I'd be happy to see a separation between the CBOR processing and Serde (as e.g. serde_yaml or msgpack-rust do it), with tag support in the non-Serde part. I'd also be happy to contribute some time to help with such a refactoring.
@vmx I played around with deserializing CBOR without serde: https://github.com/pyfisch/minicbor It takes a […] One pain point, if something like this should serve as a backend to serde-cbor, is that it needs to work with both […]
@pyfisch Wow, nice! That looks really promising. Sadly, I have zero time at the moment to dig deeper or to think about what serialization would look like. Hopefully I'll have some next year.
Serialization is the "boring" part because you can just output some bytes. Getting deserialization right is more difficult. The majority of bugs in this crate were in the deserialization part. Introducing a token iterator probably makes deserialization much easier than the approach used in serde-cbor, where we go directly from bytes to serde structures.
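To illustrate what a token iterator over CBOR bytes could look like, here is a small sketch. It is not the minicbor API: `Token`, `Error` and `Tokens` are hypothetical names, and only unsigned integers, text strings and tags are handled (definite lengths only); everything else is reported as unsupported.

```rust
/// A deliberately small token set; a real decoder needs many more variants.
#[derive(Debug, PartialEq)]
enum Token<'a> {
    Unsigned(u64),
    Text(&'a str),
    Tag(u64),
}

#[derive(Debug, PartialEq)]
enum Error {
    UnexpectedEof,
    Unsupported(u8),
    InvalidUtf8,
}

struct Tokens<'a> {
    input: &'a [u8],
}

impl<'a> Tokens<'a> {
    fn new(input: &'a [u8]) -> Self {
        Tokens { input }
    }

    /// Split off the next `n` bytes, or fail if the input is too short.
    fn take(&mut self, n: usize) -> Result<&'a [u8], Error> {
        let input: &'a [u8] = self.input;
        if input.len() < n {
            return Err(Error::UnexpectedEof);
        }
        let (head, tail) = input.split_at(n);
        self.input = tail;
        Ok(head)
    }

    /// Decode the argument encoded in, or following, the initial byte.
    fn argument(&mut self, info: u8) -> Result<u64, Error> {
        let width = match info {
            0..=23 => return Ok(u64::from(info)),
            24 => 1,
            25 => 2,
            26 => 4,
            27 => 8,
            _ => return Err(Error::Unsupported(info)),
        };
        let bytes = self.take(width)?;
        // Arguments are big-endian.
        Ok(bytes.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b)))
    }

    /// Decode one token whose initial byte has already been consumed.
    fn token(&mut self, initial: u8) -> Result<Token<'a>, Error> {
        let (major, info) = (initial >> 5, initial & 0x1f);
        match major {
            0 => Ok(Token::Unsigned(self.argument(info)?)),
            3 => {
                let len = self.argument(info)? as usize;
                let bytes = self.take(len)?;
                std::str::from_utf8(bytes)
                    .map(Token::Text)
                    .map_err(|_| Error::InvalidUtf8)
            }
            // A tag is just a token; the item it applies to follows as the next token.
            6 => Ok(Token::Tag(self.argument(info)?)),
            _ => Err(Error::Unsupported(initial)),
        }
    }
}

impl<'a> Iterator for Tokens<'a> {
    type Item = Result<Token<'a>, Error>;

    fn next(&mut self) -> Option<Self::Item> {
        let input: &'a [u8] = self.input;
        let (&initial, rest) = input.split_first()?;
        self.input = rest;
        Some(self.token(initial))
    }
}
```

With this shape, tags need no special channel at the byte level: a consumer sees `Tag(32)` followed by the tagged item and decides what to do with the pair, and a Serde layer could be built on top of the same iterator.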
Now in v0.11!
Support for CBOR tagged values is the most requested feature in serde_cbor (see #3, #56, #129, #151). Tags allow users to "give [a data item] additional semantics while retaining its structure". They enable the definition of common data types and they are needed to correctly implement some specifications that use CBOR (see for example COSE).
There is a hack (with a few variations) commonly suggested to implement tags: create a struct with a special name and two fields, one for the tag and another for the value. Now use this type to represent tagged values and give it special treatment in the `Serializer` and `Deserializer` to convert tags. This has two downsides:
Design criteria
- […] `serde_cbor::Value`s with tags.
Tasks
- […] `serde::ser::Serializer` trait to allow the serialization of tagged values. (branch)
- […] `serde::de::Visitor` to visit a tagged value and make other necessary changes for deserialization.
- […] `serde_cbor`. (experimental branch)
- […] `serde_cbor`.
- […] `Tagged` variant to `serde_cbor::Value`
- […] `url::Url` with tag 32
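As a concrete picture of what "with tag 32" means on the wire, the sketch below encodes a tag followed by a text string using only the standard library. `write_head` and `encode_tagged_text` are hypothetical helper names, not serde_cbor functions, and the URL is passed as a plain `&str` rather than a `url::Url` to keep the example dependency-free.

```rust
/// Append the initial byte and argument for a CBOR item head.
/// `major` is the major type (0..=7); `value` is the argument
/// (a length, an integer, or a tag number, depending on the major type).
fn write_head(out: &mut Vec<u8>, major: u8, value: u64) {
    let m = major << 5;
    if value < 24 {
        // Small arguments fit into the initial byte itself.
        out.push(m | value as u8);
    } else if value <= u64::from(u8::MAX) {
        out.push(m | 24);
        out.push(value as u8);
    } else if value <= u64::from(u16::MAX) {
        out.push(m | 25);
        out.extend_from_slice(&(value as u16).to_be_bytes());
    } else if value <= u64::from(u32::MAX) {
        out.push(m | 26);
        out.extend_from_slice(&(value as u32).to_be_bytes());
    } else {
        out.push(m | 27);
        out.extend_from_slice(&value.to_be_bytes());
    }
}

/// Encode `text` as a CBOR text string (major type 3) wrapped in `tag`
/// (major type 6). For URIs the tag number is 32.
fn encode_tagged_text(tag: u64, text: &str) -> Vec<u8> {
    let mut out = Vec::new();
    write_head(&mut out, 6, tag);
    write_head(&mut out, 3, text.len() as u64);
    out.extend_from_slice(text.as_bytes());
    out
}
```

For example, `encode_tagged_text(32, "http://a.io")` starts with the bytes `0xd8 0x20` (tag 32) and `0x6b` (text string of length 11), followed by the UTF-8 bytes of the URL.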