Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #257 and allow $text to work with tags in the text #511

Closed

Conversation

JOSEPHGILBY
Copy link

@JOSEPHGILBY JOSEPHGILBY commented Nov 18, 2022

This is a very early attempt at solving #257 (awful, but functional, code below). Unfortunately, I ran into a design issue so I'd like to open the discussion now and get your feedback on what to do.

Suppose we have the following struct to deserialize (from the test case attached in serde-de.rs below)

#[derive(Debug, Deserialize, PartialEq)]
struct Trivial<T> {
    #[serde(rename = "$text")]
    value: T,
}

The test case has the following xml

<root>style tags <em>in this document</em></root>

deserialize into

Trivial {
    value: "style tags <em>in this document</em>".to_string(),
}

The test case also assumes the xml

<outer><root>style tags <em>in this document</em></root></outer>

should not deserialize (missing field $text)

However, if we change the design for how $text works to include embedded tags, this would now be deserializable where 'outer' is the root tag and everything between can now be fed into a text field because root can be part of the string now.

Outside of the above being newly expected behavior we could make the user describe which tags can be deserialized into text fields instead of read:

#[derive(Debug, Deserialize, PartialEq)]
struct Trivial<T> {
    #[serde(rename = "$text$<EM>$<B>")] // only <EM>...</EM>s and <B>...</B>s are allowed to be part of the string
    value: T,
}

Let me know if you have any other ideas on what would be best to do here.

@Mingun Mingun added enhancement serde Issues related to mapping from Rust types to XML labels Nov 18, 2022
@Mingun
Copy link
Collaborator

Mingun commented Nov 18, 2022

The test case also assumes the xml

<outer><root>style tags <em>in this document</em></root></outer>

should not deserialize (missing field $text)

The test should be changed. It is expected, that it would return

"<root>style tags <em>in this document</em></root>"

Outside of the above being newly expected behavior we could make the user describe which tags can be deserialized into text fields instead of read:

I think, it is better just to have a flag is we want to have inner tags inside or not.

@JOSEPHGILBY
Copy link
Author

JOSEPHGILBY commented Nov 19, 2022

@Mingun What do you think would be a good format for indicating that inside tags can be accepted for the text field? Maybe split $text into $text_including_tags and $text_without_tags? Or maybe keep $text and create $text_with_tags. $text should be able to handle embedded tags if there is a <![CDATA[]]> wrapping it (and I think quick-xml already handles this case), but I currently have a document where they didn't wrap the text with CDATA. Instead, there's an accompanying DTD file containing the following lines

<!ELEMENT TEST (#PCDATA | EM | B)*>
.
.
.
<!ELEMENT EM (#PCDATA | B)*>
<!ELEMENT B (#PCDATA | EM)*>

Since quick-xml doesn't currently parse DTD, but I think you did mention that you would like quick-xml to somewhat follow the spirit behind DTD, I was thinking that's why we might want the user to specify which specific tags could be expected when serializing a $text field.

@Mingun
Copy link
Collaborator

Mingun commented Nov 21, 2022

Probably $html (when check_end_names should be disabled) and $xml (when not) will be the most obvious choice.

The feature should be carefully designed, because there is a lot of edge cases. For example:

<text><!-- include?-->content with <em>tags</em><text>
<text>content <!-- ...or this? -->with <em>tags</em><text>

@dralley
Copy link
Collaborator

dralley commented Jul 10, 2023

@JOSEPHGILBY Do you intend to continue this work?

@JOSEPHGILBY
Copy link
Author

Unfortunately, I don't have the time.

@lkirkwood
Copy link

How would i go about starting a PR for this? Has anyone found a workaround? My problem is closer to #257.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement serde Issues related to mapping from Rust types to XML
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants