etree serializer #849

Bartvds · 2023-09-05T18:22:41Z

To enable more creative uses of this amazing library, I'd like to propose a new serializer that renders to an etree Element instead of a file or string.

The already existing ability to parse from Element objects has been very useful, but I don't see its complement in the current API.

Ideally the implementation would support both vanilla etree and lxml (like the parser API does with some configuration).

The use-case would be to mix xsData with existing code that has an etree model. Another more specific use-case is using XSLT on xsdata output (and other lxml related operations).

As a PoC we can use a kludgy work-around: parse the XML string that xsData serializes and continue from there. This is obviously not ideal.
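For illustration, the work-around boils down to something like the sketch below. It uses only the standard library; `xml_string` is a made-up stand-in for whatever string the serializer returns, not real xsData output.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the string a serializer would return.
xml_string = '<book id="bk1"><title>Example</title></book>'

# Work-around: parse the serialized string again to get an Element.
element = ET.fromstring(xml_string)

# From here on, existing etree-based code can take over.
element.set("lang", "en")
print(ET.tostring(element, encoding="unicode"))
```

The extra parse is the part a dedicated serializer would make unnecessary.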

Looking at the code I see XmlSerializer and XmlWriter (and the Converters etc.) and I'm not sure where this would fit in. Maybe we'd still want the Converters to go back to etree-level data (like a fresh parse)? I don't know whether the most efficient etree would be built with an XmlSerializer or XmlWriter subclass, or with a different structure altogether. I can see how it could be made to work with the XmlWriter events, but that seems like overhead. To be fair, the IO-oriented APIs of these two classes don't quite fit this use-case as-is.
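To make the event-based idea concrete, here is a minimal sketch that feeds a stream of start/data/end events into the standard library's `xml.etree.ElementTree.TreeBuilder`. The `events` list and the `build_tree` helper are hypothetical and only loosely modelled on writer-style events; they are not xsData's actual event format or API.

```python
import xml.etree.ElementTree as ET

# Hypothetical event stream: (kind, payload...) tuples, invented for this sketch.
events = [
    ("start", "book", {"id": "bk1"}),
    ("start", "title", {}),
    ("data", "Example"),
    ("end", "title"),
    ("end", "book"),
]


def build_tree(events):
    """Feed the events into a TreeBuilder and return the root Element."""
    builder = ET.TreeBuilder()
    for event in events:
        kind = event[0]
        if kind == "start":
            builder.start(event[1], event[2])  # tag, attributes
        elif kind == "data":
            builder.data(event[1])  # text content
        elif kind == "end":
            builder.end(event[1])  # tag
    return builder.close()


root = build_tree(events)
```

Whether this is cheaper than a dedicated tree-building serializer would depend on how much work goes into producing the events in the first place.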

Anyway; I think the etree serializer functionality would be a natural addition to the API to go with the etree parser.

Looking forward to your thoughts on this.

tefra · 2023-09-23T11:43:17Z

We have the PycodeSerializer; maybe we could do something similar to produce an etree object.

skinkie · 2023-11-13T13:13:15Z

@tefra one thing that would be nice to have is support for larger-than-memory documents, where parts of the content are replaced while the document keeps the same structure.

    import lxml.etree

    # parser is an xsData XmlParser; ServiceJourney is presumably a generated NeTEx class.
    service_journeys = []
    tree = lxml.etree.parse("/tmp/Flix_Line_x400.xml")
    for element in tree.iterfind(".//{http://www.netex.org.uk/netex}ServiceJourney"):
        service_journey: ServiceJourney = parser.parse(element, ServiceJourney)
        service_journeys.append(service_journey)

Some modifications are then made to the service_journeys. Because xsData removes the contents of each element after parsing it, the initial parse can no longer be used. After reopening the document, however, it becomes possible to replace parts of it.

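    # getIndex is my own helper; it presumably maps each ServiceJourney id to its object.
    sjs = getIndex(service_journeys)
    keys = set(sjs.keys())
    lxml_parser = lxml.etree.XMLParser(remove_blank_text=True)
    tree = lxml.etree.parse("/tmp/Flix_Line_x400.xml", parser=lxml_parser)
    for element in tree.iterfind(".//{http://www.netex.org.uk/netex}ServiceJourney"):
        if element.attrib['id'] in keys:
            # Round trip: render the object to a string, then parse it back into an element.
            xml = serializer.render(sjs[element.attrib['id']], ns_map)
            new_element = lxml.etree.fromstring(xml.encode('utf-8'), lxml_parser)
            element.getparent().replace(element, new_element)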

As can be seen from the above, this requires the xsData object to be rendered to a string, then parsed back from that string, and only then swapped into the tree. I think a direct write_etree or something similar would skip the serialiser/deserialiser overhead.
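As a rough sketch of what skipping the round trip could look like, the example below builds the replacement element directly from an object and swaps it into the tree. It uses only the standard library; the simplified `ServiceJourney` dataclass and the `to_element` helper are hypothetical stand-ins, not the generated NeTEx class or any xsData API, and namespaces are left out for brevity.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, fields


@dataclass
class ServiceJourney:  # simplified stand-in, not the generated NeTEx class
    id: str
    name: str


def to_element(obj):
    """Hypothetical helper: build an Element straight from a flat dataclass."""
    element = ET.Element(type(obj).__name__, {"id": obj.id})
    for field in fields(obj):
        if field.name != "id":
            ET.SubElement(element, field.name).text = str(getattr(obj, field.name))
    return element


root = ET.fromstring(
    '<journeys><ServiceJourney id="sj1"><name>old</name></ServiceJourney></journeys>'
)
updated = {"sj1": ServiceJourney(id="sj1", name="new")}

# Replace matching children in place; no string is rendered or re-parsed.
# list(...) snapshots the tree so it is not mutated while being iterated.
for parent in list(root.iter()):
    for index, child in enumerate(list(parent)):
        if child.tag == "ServiceJourney" and child.get("id") in updated:
            parent[index] = to_element(updated[child.get("id")])
```

With lxml the replacement step could stay as `element.getparent().replace(...)`; the standard library has no `getparent()`, which is why the sketch walks parents instead.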

tefra · 2024-03-11T10:38:45Z

Give it a look

https://xsdata.readthedocs.io/en/latest/data_binding/tree_serializing/

tefra added a commit that referenced this issue Mar 11, 2024

feat: Add xml and lxml tree serializers

68a74ff

Resolves #849

tefra mentioned this issue Mar 11, 2024

feat: Add xml and lxml tree serializers #975

Merged

4 tasks

tefra added a commit that referenced this issue Mar 11, 2024

feat: Add xml and lxml tree serializers

9f40f5c

Resolves #849

tefra added a commit that referenced this issue Mar 11, 2024

feat: Add xml and lxml tree serializers

6d71a64

Resolves #849

tefra added a commit that referenced this issue Mar 11, 2024

feat: Add xml and lxml tree serializers

d0e2d56

Resolves #849

tefra closed this as completed in #975 Mar 11, 2024
