Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for writing a header for nested structs #197

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

ilya-epifanov
Copy link

@ilya-epifanov ilya-epifanov commented Apr 15, 2020

This covers some of the #98 and #155 use cases.
It makes #[serde(flatten)] unnecessary for nested structs, as CSV is a flat format anyway.

@ilya-epifanov
Copy link
Author

@fmorency, @Kampfkarren, could you please take a look if this covers your use cases?

@ilya-epifanov
Copy link
Author

ilya-epifanov commented Apr 15, 2020

In particular, take a look at this test:

#[test]

@ilya-epifanov
Copy link
Author

@BurntSushi could you please take a look?

@CGMossa
Copy link

CGMossa commented Jun 21, 2020

I've tried to see if this works with my use case, which is nested structs and wishing to write the headers for them, and it didn't.

@jeremias-blendin-intel
Copy link

The patch works for my simple use case, two nested structs. It would help me if the PR got merged.

@0xpr03
Copy link

0xpr03 commented Jun 28, 2020

Would help here also for flattening stuff which I don't want to patch in one big struct.

@Luthaf
Copy link

Luthaf commented Nov 26, 2020

This branch works fine for me for serializing nested structs. Thanks a lot @ilya-epifanov!

@amrhassan
Copy link

Thank you so much @ilya-epifanov. This however fails when serializing nested structs wrapped in enums.

use serde::Serialize;

#[derive(Serialize)]
struct Row {
    nested_one: Option<Nested>,
    nested_two: Nested,
}

#[derive(Serialize)]
struct Nested { x: i32, y: i32 }

fn main() {
    let mut writer = csv::Writer::from_writer(std::io::stdout());
    let row = Row { nested_one: Some(Nested { x: 1, y: 2 }), nested_two: Nested { x: 3, y: 4}};
    writer.serialize(row).unwrap();
}

This panics with Error(UnequalLengths { pos: None, expected_len: 3, len: 4 }) after writing:

nested_one,x,y
1,2,3,4

@haixuanTao
Copy link

haixuanTao commented Mar 1, 2021

Ok, @BurntSushi, this is what I got.

Summary

There is 2 main step in writing the CSV: - Writing the header and - Writing the values.

Currently the values are written properly even when they are nested in a struct. On the other hand, the header for a struct is written as the field name of the struct and not the nested field of the struct.

That's the problem this PR tries to solve and that is why this PR mainly modifies the behaviour of struct SeHeader.

How it is implemented

Basically, this PR hack the serialize_field by making the writing of the header field conditional to the type of the field.
It would behave as before for any type that is not a struct, but, if it is a struct, it is not going to write it down. The state will move down its nested field and everything will work.

In order to check the type, the PR has set a new field for the struct SeHeader. This new field is called structs_written (Line 451, 456-459). It is a counter of the number of struct it has seen since the beginning of the writting of the header. This counter is incremented each time serialize_struct is called.

Therefore, if the structs_written counter changes when encountering a new field that means the new field is a struct(Line 802-805).

Problem

This however does not work when the struct is encapsulated in an option as @amrhassan pointed out.

The underlying problem is that the hack does not work this time as serialize_struct is not called when the struct is in an option. Instead serialize_some is called.

Now we could try to patch serialize_some as well and increment structs_written but i did not find a patch for it, and it will look a bit shaky.

A second problem I faced is that we cannot use the #[serde(flatten)], otherwise the writting will fail with serializing maps is not supported ... called in serialize_map.
This is kind of counter-intuitive as it is the recommended way to use serde serialization.

I think it will be better to redo this PR for nesting by implementing the serialize_map method with some conditioning on the input and use serde #[serde(flatten)].

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants