Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xml #39

Closed
wants to merge 50 commits into from
Closed

Xml #39

wants to merge 50 commits into from

Conversation

oli-obk
Copy link
Member

@oli-obk oli-obk commented Mar 19, 2015

would you be so kind as to review the serde_macros and Deserializer changes?

current state:

  • xml to struct deserialization
  • deserialize bool, int, string from <anytagname>value</anythingelse>
  • deserialize sequences (tuple, array, vector) as struct member
  • deserialize escaped chars (&abcd;)
  • deserialize CDATA
  • deserialize enumerations
  • deserialize arrays of enumerations
  • deserialize errors instead of assertions
  • more deserialize tests
  • parse to dom tree
  • remove debug print! messages
  • struct to xml serialization

anti-features (just ignoring stuff I don't like):

  • ignore namespaces
  • skip xml comments
  • skip xml version tag
  • ignoring xml-attributes

things that would be nice to have but might not be possible

  • xsd verification
  • sequences of sequences (how would these even look like in xml?)
  • attributes to collapse xml elements that only contain a single type of element.

@oli-obk oli-obk changed the title Review Xml Xml Mar 19, 2015
@erickt
Copy link
Member

erickt commented Mar 20, 2015

First off this is awesome. I'll go through it tonight. Second, I was actually thinking about factoring json out into a separate crate as folks might not want json, or someone might come up with a better one down the road. What do you think?

@oli-obk
Copy link
Member Author

oli-obk commented Mar 20, 2015

I've read about cargo having a few problems with multiple inheritance. Someone might be using xml and json crates that require different serde versions, which might get messy. This needs to be checked.

@oli-obk
Copy link
Member Author

oli-obk commented Mar 20, 2015

Also: someone implementing a better/different json-deserializer won't have a conflict. During deserialization you choose which Deserializer to use; Rust doesn't care/know that two Deserializers deserialize the same format.

@oli-obk
Copy link
Member Author

oli-obk commented Mar 24, 2015

@erickt there are a few xml-specific things I'd like to discuss. Should I start a thread on the rust user forum or a wiki entry or an issue here?

@erickt
Copy link
Member

erickt commented Mar 26, 2015

@oli-obk: Hi there! Sure, a thread would be wonderful. And in case you haven't noticed, I've been slowly pulling bits and pieces from your PR, so thanks for all the hard work so far!

@oli-obk
Copy link
Member Author

oli-obk commented Mar 26, 2015

http://users.rust-lang.org/t/rfc-serde-xml-support/737

Yea I noticed, thanks. Although after merging Xml, some coverage checks would probably be good to figure out if I'm actually using all features I introduced.
Example: I don't think the end function of Sequences and Maps ever needs to be implemented. All the code could be in SeqVisitor::visit_value or MapVisitor::visit_key. Then again, it might be more readable to put it in end

@oli-obk
Copy link
Member Author

oli-obk commented Mar 27, 2015

separating it out into its own crate isn't really working, and I don't know why:

D:\rust-serde\xml\tests\test_xml.rs:112:5: 112:18 error: the trait `serde::ser::Serialize` is not implemented for the type `Simple` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:112     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:112:5: 112:18 error: the trait `serde::de::Deserialize` is not implemented for the type `Simple` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:112     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:135:5: 135:18 error: the trait `serde::ser::Serialize` is not implemented for the type `test_parse_xml_value::Test` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:135     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:135:5: 135:18 error: the trait `serde::de::Deserialize` is not implemented for the type `test_parse_xml_value::Test` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:135     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:146:5: 146:18 error: the trait `serde::ser::Serialize` is not implemented for the type `Outer` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:146     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:146:5: 146:18 error: the trait `serde::de::Deserialize` is not implemented for the type `Outer` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:146     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:210:5: 210:18 error: the trait `serde::ser::Serialize` is not implemented for the type `test_parse_attributes::A` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:210     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:210:5: 210:18 error: the trait `serde::de::Deserialize` is not implemented for the type `test_parse_attributes::A` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:210     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:226:5: 226:18 error: the trait `serde::ser::Serialize` is not implemented for the type `test_parse_attributes::B` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:226     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:226:5: 226:18 error: the trait `serde::de::Deserialize` is not implemented for the type `test_parse_attributes::B` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:226     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:262:5: 262:18 error: the trait `serde::ser::Serialize` is not implemented for the type `test_parse_hierarchies::C` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:262     test_parse_ok(&[
                                            ^~~~~~~~~~~~~
D:\rust-serde\xml\tests\test_xml.rs:262:5: 262:18 error: the trait `serde::de::Deserialize` is not implemented for the type `test_parse_hierarchies::C` [E0277]
D:\rust-serde\xml\tests\test_xml.rs:262     test_parse_ok(&[
                                            ^~~~~~~~~~~~~

@nicolai86
Copy link

What's the best way to play around with this pr? I'd like to parse this (http://www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml) and all necessary features should be inplace already... 👍

(I'm asking because the current version does not compile on either rust-1.0 beta nor the nightly…)

# cargo build --verbose
   Compiling serde v0.3.0 (https://github.com/oli-obk/rust-serde.git?rev=1df4e889d44d86a3413af9648ba9594753585eb1#1df4e889)
     Running `rustc /Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/lib.rs --crate-name serde --crate-type lib -g -C metadata=570d4f9da9420900 -C extra-filename=-570d4f9da9420900 --out-dir /private/tmp/hello_world/target/debug/deps --emit=dep-info,link -L dependency=/private/tmp/hello_world/target/debug/deps -L dependency=/private/tmp/hello_world/target/debug/deps -Awarnings`
/Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/json/error.rs:142:6: 142:33 error: use of undeclared trait name `error::FromError`
/Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/json/error.rs:142 impl error::FromError<io::Error> for Error {
                                                                                                                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
/Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/xml/error.rs:73:6: 73:33 error: use of undeclared trait name `error::FromError`
/Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/xml/error.rs:73 impl error::FromError<io::Error> for Error {
                                                                                                                                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~
error: aborting due to 2 previous errors
Could not compile `serde`.

Caused by:
  Process didn't exit successfully: `rustc /Users/nicolai86/.cargo/git/checkouts/rust-serde-2276df5409bb643b/1df4e889d44d86a3413af9648ba9594753585eb1/src/lib.rs --crate-name serde --crate-type lib -g -C metadata=570d4f9da9420900 -C extra-filename=-570d4f9da9420900 --out-dir /private/tmp/hello_world/target/debug/deps --emit=dep-info,link -L dependency=/private/tmp/hello_world/target/debug/deps -L dependency=/private/tmp/hello_world/target/debug/deps -Awarnings` (exit code: 101)

@oli-obk
Copy link
Member Author

oli-obk commented Apr 9, 2015

@nicolai86 it should work once rust-quasi is updated to master

@hugoduncan
Copy link
Contributor

Updated to latest master and rustc nightly - https://github.com/hugoduncan/rust-serde/tree/oli-obk-xml
Looks like I need "skip xml version tag" to be implemented for this to be useful.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 15, 2015

@hugoduncan: I'm on it wrt skipping comments, version tags and similar. thanks for the rebase, I'll look into it

@hugoduncan
Copy link
Contributor

I found I needed to strip xmlns attributes from the input. It might be nice longer term to find a way of annotating the type with it's required namespace and actually produce/verify the xmlns attribute.

I also needed to strip whitespace between tags to get input to parse.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 15, 2015

I also needed to strip whitespace between tags to get input to parse.

do you have an example that doesn't parse? I'll add it as a unit test.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 15, 2015

I found I needed to strip xmlns attributes from the input. It might be nice longer term to find a way of annotating the type with it's required namespace and actually produce/verify the xmlns attribute.

again, do you have examples and the expected behavior? I don't know much about xmlns :/

@hugoduncan
Copy link
Contributor

@oli-obk: an example string:

"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<DescribeInstancesResponse xmlns=\"http://ec2.amazonaws.com/doc/2014-10-01/\">\n    <requestId>9474f558-10a5-42e8-84d1-f9ee181fe943</requestId>\n    <reservationSet/>\n</DescribeInstancesResponse>"

As a first iteration I think it is reasonable to just ignore xmlns attributes.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 15, 2015

I think xsd verification is out of scope of serde, but I might be wrong. Anyway, I can silently ignore it.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 15, 2015

your example string works now

@hugoduncan
Copy link
Contributor

Another example. vpcSet is a sequence, that can have many items, each of which is a struct with vpcId and state fields. Currently I can't seem to construct anything that will deserialise the item. I think this might require some sort of attribute, e.g. #[serde(rename='item')], to set the expected tag name on the struct for item.

<?xml version="1.0" encoding="UTF-8"?>
<DescribeVpcsResponse xmlns="http://ec2.amazonaws.com/doc/2014-10-01/">
    <requestId>8d521e9a-509e-4ef6-bbb7-9f1ac0d49cd1</requestId>
    <vpcSet>
        <item>
            <vpcId>vpc-ba0d18d8</vpcId>
            <state>available</state>
        </item>
    </vpcSet>
</DescribeVpcsResponse>

@oli-obk
Copy link
Member Author

oli-obk commented Apr 16, 2015

@erickt: I think I need to extend serde with "fixed size sequences" to get sequences of sequences to work. http://users.rust-lang.org/t/rfc-serde-xml-support/737/9

@oli-obk
Copy link
Member Author

oli-obk commented Apr 16, 2015

@hugoduncan actually this is already possible:

let s = r#"
<?xml version="1.0" encoding="UTF-8"?>
<DescribeVpcsResponse xmlns="http://ec2.amazonaws.com/doc/2014-10-01/">
    <requestId>8d521e9a-509e-4ef6-bbb7-9f1ac0d49cd1</requestId>
    <vpcSet>
        <item>
            <vpcId>vpc-ba0d18d8</vpcId>
            <state>available</state>
        </item>
    </vpcSet>
</DescribeVpcsResponse>"#;
#[derive(PartialEq, Debug, Serialize, Deserialize)]
#[allow(non_snake_case)]
struct Item {
    vpcId: String,
    state: String,
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
struct Helper {
    item: Vec<Item>,
}
#[derive(PartialEq, Debug, Serialize, Deserialize)]
#[allow(non_snake_case)]
struct DescribeVpcsResponse {
    requestId: String,
    vpcSet: Helper,
}
test_parse_ok(&[
    (
        s,
        DescribeVpcsResponse {
            requestId: "8d521e9a-509e-4ef6-bbb7-9f1ac0d49cd1".to_string(),
            vpcSet: Helper { item: vec![ Item {
                vpcId: "vpc-ba0d18d8".to_string(),
                state: "available".to_string(),
            }]},
        },
    ),
]);

@oli-obk
Copy link
Member Author

oli-obk commented Apr 16, 2015

This would also give you the (in my opinion) expected response.vpcSet.item[5] syntax. Not sure if you meant to access response.vpcSet[5]. In that case some additional attributes would be required that manipulate the parsing just for xml... I'll add it to the list of maybe stuff.

@erickt
Copy link
Member

erickt commented Apr 26, 2015

@oli-obk: Would you like to land this in a https://github.com/serde-rs repo?

@oli-obk
Copy link
Member Author

oli-obk commented Apr 28, 2015

@erickt sounds like a plan. a separate crate should speed up compilation quite a bit.

@oli-obk
Copy link
Member Author

oli-obk commented Apr 28, 2015

closed in favor of https://github.com/oli-obk/serde-xml

@oli-obk oli-obk closed this Apr 28, 2015
@oli-obk oli-obk deleted the xml branch May 19, 2015 13:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants