etree serializer #849

Closed
Bartvds opened this issue Sep 5, 2023 · 3 comments · Fixed by #975
Comments


Bartvds commented Sep 5, 2023

For creative uses of this amazing library, I'd like to propose adding a new serializer that renders to an etree Element instead of a file or string.

The already existing ability to parse from Element objects has been very useful, but I don't see its complement in the current API.

Ideally the implementation would support both vanilla etree and lxml (like the parser API does with some configuration).

The use case would be mixing xsData with existing code that has an etree model. A more specific use case is applying XSLT to xsData output (and other lxml-related operations).

As a PoC we can use a kludgy workaround: parse the XML string that xsData serializes and continue from there. This is obviously not ideal.

Looking at the code I see XmlSerializer and XmlWriter (and the Converters etc.) and I'm not sure where this would fit in. Maybe we'd still want the Converters to go back to etree-level data (like a fresh parse)? I don't know whether the most efficient etree would be built with an XmlSerializer or XmlWriter subclass, or with a different structure altogether. I can see how it could be made to work with the XmlWriter events, but that seems like overhead. To be fair, the IO-oriented APIs of these two classes don't quite fit this use case as-is.
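To make the event-based idea concrete, here is a minimal sketch that feeds start/data/end events into the standard library's `xml.etree.ElementTree.TreeBuilder`. The event tuples and the `events_to_element` helper are assumptions made up for this illustration; they are not the actual XmlWriter event format, and this is not how xsData implements anything.

```python
import xml.etree.ElementTree as ET

# Hypothetical event stream. The tuple shapes are an assumption for this
# sketch and do not mirror xsData's real XmlWriter events.
EVENTS = [
    ("start", "book", {"id": "bk001"}),
    ("start", "title", {}),
    ("data", "Dune"),
    ("end", "title"),
    ("end", "book"),
]


def events_to_element(events):
    """Build an Element by replaying events into the stdlib TreeBuilder."""
    builder = ET.TreeBuilder()
    for event in events:
        kind = event[0]
        if kind == "start":
            builder.start(event[1], event[2])  # tag, attributes
        elif kind == "data":
            builder.data(event[1])  # text content
        elif kind == "end":
            builder.end(event[1])  # tag
        else:
            raise ValueError(f"unknown event: {kind!r}")
    # close() returns the root element of the finished tree.
    return builder.close()


root = events_to_element(EVENTS)
```

The result is an in-memory Element, so no string is written or re-parsed along the way. Whether replaying events is cheaper than a dedicated tree-building serializer is exactly the open question above.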

Anyway, I think the etree serializer functionality would be a natural addition to the API to go with the etree parser.

Looking forward to your thoughts on this.

Owner

tefra commented Sep 23, 2023

We have the PycodeSerializer; maybe we could do something similar to produce an etree object.

Contributor

skinkie commented Nov 13, 2023

@tefra one thing that would be nice to have is support for larger-than-memory documents, where parts of the content are replaced while the document keeps the same structure.

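    import lxml.etree
    from xsdata.formats.dataclass.parsers import XmlParser

    # ServiceJourney is the generated NeTEx binding class (import not shown).
    parser = XmlParser()  # assumed: the xsData parser used in this snippet
    service_journeys: list[ServiceJourney] = []
    tree = lxml.etree.parse("/tmp/Flix_Line_x400.xml")
    for element in tree.iterfind(".//{http://www.netex.org.uk/netex}ServiceJourney"):
        service_journey = parser.parse(element, ServiceJourney)
        service_journeys.append(service_journey)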

Some modifications are then made to the service_journeys. Because xsData removes the contents of the element after parsing, the initial parse tree can no longer be used. After reopening the document, it becomes possible to replace part of it.

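    sjs = getIndex(service_journeys)  # helper not shown; used as an id -> object mapping
    keys = set(sjs.keys())
    # Separate name so the xsData parser above is not shadowed.
    lxml_parser = lxml.etree.XMLParser(remove_blank_text=True)
    tree = lxml.etree.parse("/tmp/Flix_Line_x400.xml", parser=lxml_parser)
    # Collect the matches first, because the tree is modified inside the loop.
    for element in list(tree.iterfind(".//{http://www.netex.org.uk/netex}ServiceJourney")):
        if element.attrib["id"] in keys:
            # Round trip: xsData object -> XML string -> lxml element.
            xml = serializer.render(sjs[element.attrib["id"]], ns_map)
            new_element = lxml.etree.fromstring(xml.encode("utf-8"), lxml_parser)
            element.getparent().replace(element, new_element)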

As can be seen above, this requires rendering the xsData object to a string, parsing that string back into an element, and then replacing the original element. I think a direct write_etree or something similar would skip the serialiser/deserialiser overhead.
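For comparison, this is roughly what the replacement step could look like if the new element were already available as an Element, with no string in between. It is a sketch using only the standard library: `replace_by_id` is a hypothetical helper, the sample document is invented, and the hand-built `updated` element stands in for whatever a tree serializer would return. The stdlib has no `getparent()`, so the helper scans for parents instead.

```python
import xml.etree.ElementTree as ET

NS = "http://www.netex.org.uk/netex"
SJ_TAG = f"{{{NS}}}ServiceJourney"


def replace_by_id(root, tag, replacements):
    """Swap children with a matching tag and id for ready-made Elements."""
    # Collect targets first so the tree is not modified while iterating.
    targets = [
        (parent, index, replacements[child.get("id")])
        for parent in root.iter()
        for index, child in enumerate(parent)
        if child.tag == tag and child.get("id") in replacements
    ]
    for parent, index, new_element in targets:
        parent[index] = new_element  # in-place swap, position is preserved
    return len(targets)


# Invented sample document, only for this illustration.
doc = ET.fromstring(
    f'<frame xmlns="{NS}">'
    '<ServiceJourney id="sj1"><Name>old</Name></ServiceJourney>'
    '<ServiceJourney id="sj2"><Name>keep</Name></ServiceJourney>'
    '</frame>'
)

# Stand-in for the output of a hypothetical etree serializer.
updated = ET.Element(SJ_TAG, {"id": "sj1"})
ET.SubElement(updated, f"{{{NS}}}Name").text = "new"

count = replace_by_id(doc, SJ_TAG, {"sj1": updated})
names = [name.text for name in doc.iter(f"{{{NS}}}Name")]
```

With lxml the scan is unnecessary, since `element.getparent().replace(old, new)` does the swap directly, as in the snippet above.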

tefra added a commit that referenced this issue Mar 11, 2024
Owner

tefra commented Mar 11, 2024

Give it a look

https://xsdata.readthedocs.io/en/latest/data_binding/tree_serializing/


3 participants