Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Struct flattening not working #115

Closed
Tracked by #397
ebkalderon opened this issue Jun 10, 2018 · 23 comments · Fixed by #455
Closed
Tracked by #397

Struct flattening not working #115

ebkalderon opened this issue Jun 10, 2018 · 23 comments · Fixed by #455

Comments

@ebkalderon
Copy link
Contributor

I'm trying to use Serde's struct flattening functionality with RON, using the Rust code below:

#[derive(Deserialize, Serialize)]
struct Main {
    #[serde(flatten)]
    required: Required,
    #[serde(flatten)]
    optional: Optional,

    some_other_field: u32,
}

#[derive(Deserialize, Serialize)]
struct Required {
    first: u32,
    second: u32,
}

#[derive(Deserialize, Serialize)]
struct Optional {
    third: Option<u32>,
}

This is the RON that I am attempting to deserialize:

Main(
    first: 1,
    second: 2,
    third: Some(3),
    some_other_field: 1337,
)

When I try to deserialize the target struct from RON, I get the following error:

thread 'example::test' panicked at 'called `Result::unwrap()` on an `Err` value: Parser(ExpectedMap, Position { col: 9, line: 2 })', libcore/result.rs:945:5        
note: Run with `RUST_BACKTRACE=1` for a backtrace.

On the other hand, deserializing from the following TOML string works as intended:

first = 1
second = 2
third = 3
some_other_field = 4

Any indication on what might be going wrong?

@kvark
Copy link
Collaborator

kvark commented Jun 14, 2018

Interesting. I haven't worked with struct flattening before. Wondering if we need any special handling for those.

@ebkalderon
Copy link
Contributor Author

@kvark It looks like there are two kinds of struct flattening available in Serde. There's flattening into a map, which acts as a "catch-all" for fields that aren't found in your struct, like this:

#[derive(Deserialize, Serialize)]
struct MyType {
    first: u32
    second: u32,
    #[serde(flatten)]
    everything_else: HashMap<String, Value>,
}

Then there's flattening into a struct, which operates in a similar manner and is demonstrated in the example code I showed in the first comment of the issue.

Not sure if this needs any special handling on RON side either, but I can't think of what else might be going wrong. It appears that toml handles it correctly, but I can't think of what they might be doing different.

@kvark kvark added the bug label Jun 14, 2018
@torkleyy
Copy link
Contributor

torkleyy commented Aug 5, 2018

Blocked by serde-rs/serde#1346

@torkleyy
Copy link
Contributor

torkleyy commented Jan 9, 2019

Well, it seems we need some workaround since the serde issue won't be fixed.

@solarretrace
Copy link

I've hit this problem, and found a workaround by using both a struct and map together.

I have an enum variant that looks like this:

pub struct ObjectCommon { x: u32, y: u32 }

pub enum Object {
    Cycle {
        #[serde(flatten)]
        common: ObjectCommon,
        extra: u32,
    },
}

I can successfully deserialize it from this:

Cycle ({ x: 0, y: 0, extra: 0 })

Note the extra curly brackets.

@Frizi
Copy link

Frizi commented May 10, 2019

This bug makes it impossible to use palette crate's types with ron, as it uses struct flattening everywhere.

@torkleyy
Copy link
Contributor

torkleyy commented Jun 6, 2019

The only way we could fix this is by allowing RON to deserialize the RON struct syntax as map. This doesn't seem to be a good idea, though.

@Boscop
Copy link

Boscop commented Feb 26, 2020

Any updates on this? :)

@Boscop
Copy link

Boscop commented Dec 20, 2020

Any update on this now? I just ran into this and got a very unhelpful error:

Caused by:
    2:1: Expected map

@kvark
Copy link
Collaborator

kvark commented Dec 21, 2020

@Boscop sorry, it looks like nobody is looking into this. Any help is appreciated!

@juntyr
Copy link
Member

juntyr commented Aug 21, 2022

One could get flattening to work with this slightly frightening code:

fn deserialize_map<V>(mut self, visitor: V) -> Result<V::Value>
    where
        V: Visitor<'de>,
    {
        /* START */
        let mut wc = self.bytes;

        // remove any struct name
        if !self.newtype_variant {
            let _ = wc.identifier();
            wc.skip_ws()?;
        }

        // check if this could be a struct
        if self.newtype_variant || wc.consume("(") {
            wc.skip_ws()?;

            // only serialise a struct as a map if it definitely looks like a struct
            if !wc.check_tuple_struct().unwrap_or(true) {
                let _ = self.bytes.identifier();
                return self.deserialize_struct("", &[], visitor);
            }
        }
        /* END */

        self.newtype_variant = false;

        ...

While no existing test cases fail, this does seem like Pandora's box for rather weird bugs in the future. In particular, it has to accept any struct name that could be there, circumventing the usual equality checks we have in deserialize_struct.

juntyr added a commit to juntyr/ron that referenced this issue Aug 21, 2022
juntyr added a commit to juntyr/ron that referenced this issue Aug 21, 2022
@juntyr
Copy link
Member

juntyr commented Aug 21, 2022

You can find a more refined implementation with some first test cases in #403 - I'm not sure if this is the right path forward, though.

@ericsampson
Copy link

Any thoughts on an approach to move forward?

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

I have been thinking again about #403 again recently but have not come up yet with a better solution. While we could land #403 at any time, having this special case of deserialising a map from what RON sees as a struct just seems wrong. Unfortunately, this also does not seem to be a case where serde uses some specially named Deserializer struct or anything like that which we could detect to only enable this bypass in a very limited set of circumstances.

@ericsampson If you are anyone else has some ideas of how to improve #403 (with a better impl or more restricted application) I would be very happy to merge the PR and get flattening to work. Right now it still seems like a too-large can of worms.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

I did a bit more digging and flatten is even more broken than I had realised the last time. #403 would only sort-of-fix deserialising from a RON struct into a flattened struct, but it does not solve serialising at all. At serialisation time, serde uses Serializer::serialize_map to output the flattened struct - so far so good? If it forces RON to output a map and RON could then read in that map, all would be good even if serde would force ugly map syntax on RON users. Unfortunately, no.

Serialising has no notion of identifiers but deserialising does. So while serialising a flattened struct outputs a map with stringified keys, at deserialisation time serde then asks for a map which contains identifiers as keys. In RON identifiers are typed to not be strings. So while serialising outputs {"field": value}, deserialising expects {field: value}. If RON now starts to accept strings where an identifier should be, it is forced to repeat all of JSON's problems just because serde was designed for JSON and, as it seems, no other format in mind. And this just does not seem like a good path - RON's codebase cannot consist of 50% special cases to handle serde's weird data model to the point where we can no longer understand this code.

I think there are two ways out of this and I'm sorry that they both sound bad right now. I would be delighted if someone comes up with something better.

  1. RON never supports flattened structs. Given that this issue has clearly attracted interest, this would pain me.
  2. RON gets an additional deserialising mode that just tries to make deserialising work, somehow (serialising flattened structs would still be broken since we just cannot start to treat every serialize_map call as if it were a struct - serde would need to fix this with new API and the willingness for that does not seem to be there). This would split RON in two, a strict mode and a serde mode, and would complicate the code base.
  3. (in no way serious) the community releases a new crate serde2 that works with existing serde impls to ease adoption but cleans up both the serialisation and deserialisation APIs to mirror each other, be as strongly typed as possible, and exposes serde's internal buffering API in a way that formats can overwrite it and avoid the problems with, e.g. untagged enums.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

Ok, with some extra playing around there is a serious 4th option. #453 would at least allow flattened structs to roundtrip through RON. However, the syntax would still use maps since that's what serde provides us with and getting serialisation and deserialisation to work with RON structs instead would require just too many contortions right now. Thus, #453 is something we could land rather soon and thus at least have some support for flattening in RON.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

@ebkalderon @solarretrace @Frizi @ericsampson @kvark @torkleyy What are your thoughts on #453 as a minimal fix?

@clouds56
Copy link
Contributor

In no way serious,

new crate serde2 that works with existing serde impls to ease adoption

This would not work, since the output of #[derive(serde::Serialize)] would generate code like this

#[derive(serde::Serialize)]
struct A {
  a: String,
}

to

#[allow(unused_extern_crates, clippy::useless_attribute)]
extern crate serde as _serde;
#[automatically_derived]
impl _serde::Serialize for A {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_struct(__serializer, "A", false as usize + 1)?;
        _serde::ser::SerializeStruct::serialize_field(&mut __serde_state, "a", &self.a,)?;
        _serde::ser::SerializeStruct::end(__serde_state)
    }
}

and a flatten struct

#[derive(serde::Serialize)]
struct A {
  a: String,
  #[serde(flatten)]
  b: BTreeMap<String, bool>,
}

to

#[allow(unused_extern_crates, clippy::useless_attribute)]
extern crate serde as _serde;
#[automatically_derived]
impl _serde::Serialize for A {
    fn serialize<__S>(
        &self,
        __serializer: __S,
    ) -> _serde::__private::Result<__S::Ok, __S::Error>
    where
        __S: _serde::Serializer,
    {
        let mut __serde_state = _serde::Serializer::serialize_map(__serializer, _serde::__private::None)?;
        _serde::ser::SerializeMap::serialize_entry(&mut __serde_state, "a", &self.a)?;
        _serde::Serialize::serialize(&&self.b, _serde::__private::ser::FlatMapSerializer(&mut __serde_state))?;
        _serde::ser::SerializeMap::end(__serde_state)
    }
}

Note we lost "A" in the first place.

So in order to have the derive working, we have to fork serde and replace it as a drop-in solution.

[patch.crates-io]
serde = { git = 'https://github.com/example/serde2.git' }

@dtolnay
Copy link

dtolnay commented Apr 23, 2023

For comparison, serde_starlark (https://github.com/dtolnay/serde-starlark) supports flatten, and Starlark syntax is structurally extremely similar to RON.

Example: https://github.com/bazelbuild/rules_rust/blob/84f1d0657d28df8ddea71c492412b9480c70025d/crate_universe/src/utils/starlark.rs#L179-L180

That data structure gets serialized to Starlark which looks like this: https://github.com/dtolnay/cxx/blob/d8427f9c251cac0d6c35a26c0784319c95b208d8/third-party/bazel/BUILD.winapi-util-0.1.5.bazel#L17

Notice that name and deps in the output are coming from fields of RustLibrary, while many of the other fields like srcs and crate_root are coming from the flattened CommonAttrs.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

For comparison, serde_starlark (https://github.com/dtolnay/serde-starlark) supports flatten, and Starlark syntax is structurally extremely similar to RON.

Example: https://github.com/bazelbuild/rules_rust/blob/84f1d0657d28df8ddea71c492412b9480c70025d/crate_universe/src/utils/starlark.rs#L179-L180

That data structure gets serialized to Starlark which looks like this: https://github.com/dtolnay/cxx/blob/d8427f9c251cac0d6c35a26c0784319c95b208d8/third-party/bazel/BUILD.winapi-util-0.1.5.bazel#L17

Notice that name and deps in the output are coming from fields of RustLibrary, while many of the other fields like srcs and crate_root are coming from the flattened CommonAttrs.

Thanks @dtolnay for sharing that crate! I had a quick look and it seems like the FunctionCall wrapper struct is used to serialise a flattened struct like a struct? Such a wrapper seems very useful indeed and could perhaps be extended to do deserialisation as well. It would definitely be possible to add such a wrapper to RON, though that would mean that users would need to tie their Deserializer-generic code to RON-specifics. Is there a way this could be generalised?

@clouds56
Copy link
Contributor

clouds56 commented Apr 23, 2023

@juntyr just said what I'd like to say.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

In no way serious,

new crate serde2 that works with existing serde impls to ease adoption

This would not work, since the output of #[derive(serde::Serialize)] would generate code like this

My very quick (and again in no way seriously considered idea) was to add blanket impls for existing Serialize and Deserialize impls and wrapper structs for existing Serializers and Deserializers. Any old code could then fall back to the existing serde behaviour. However, formats supporting the v2.0 version could use the new API, and if the data supports them as well then more strictly typed interactions would be possible. The wrappers and blanket impls would work similar to your idea in serde-rs/serde#1346 (comment) and just delegate to existing APIs, e.g. flattened structs would continue to use maps.

@juntyr
Copy link
Member

juntyr commented Apr 23, 2023

For comparison, serde_starlark (https://github.com/dtolnay/serde-starlark) supports flatten, and Starlark syntax is structurally extremely similar to RON.
Example: https://github.com/bazelbuild/rules_rust/blob/84f1d0657d28df8ddea71c492412b9480c70025d/crate_universe/src/utils/starlark.rs#L179-L180
That data structure gets serialized to Starlark which looks like this: https://github.com/dtolnay/cxx/blob/d8427f9c251cac0d6c35a26c0784319c95b208d8/third-party/bazel/BUILD.winapi-util-0.1.5.bazel#L17
Notice that name and deps in the output are coming from fields of RustLibrary, while many of the other fields like srcs and crate_root are coming from the flattened CommonAttrs.

Thanks @dtolnay for sharing that crate! I had a quick look and it seems like the FunctionCall wrapper struct is used to serialise a flattened struct like a struct? Such a wrapper seems very useful indeed and could perhaps be extended to do deserialisation as well. It would definitely be possible to add such a wrapper to RON, though that would mean that users would need to tie their Deserializer-generic code to RON-specifics. Is there a way this could be generalised?

Ok, for serialising, the problem is that struct fields need &'static str fields while maps support any serialisable key. While we could have a Serializer that only allows maps with string keys, we have no way of knowing if the keys were static or not. So they would need to be interned (for which a few crates exist but this would still be a bit of a problem).

For deserialising, the wrapper would need to have access to the struct/variant name and fields as &'static strs to supply them to the Deserializer such that it can deserialise a struct (variant), the visit calls for which could then be slightly modified such that the inner flattened struct still thinks it's just seeing a map. However, if the user has to manually give these keys, then a lot of the benefit of flattening seems to be gone.

Therefore, a clean generalised approach seems quite difficult to me.

A specific format could probably support such wrapper structs by using some in-band signalling to communicate that we're looking at a flattened struct. However, that would again be format specific. So the only option I see right now would be a general crate that uses in-band signalling in some way such that non-supporting formats still do something ok but supporting formats can actually handle flattened structs nicely. However, I've not yet given the details here any thought.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
10 participants