etree serializer #849
Comments
We have the PycodeSerializer; maybe we could do something similar to produce an etree object.
@tefra one thing that would be nice to have is support for larger-than-memory documents, replacing their contents while following the same structure.
Some modifications have now been made to the service_journeys. Because xsData (after parsing) removes the contents of the element, the initial parse can no longer be used. After reopening the document, it becomes possible to replace part of it:

    import lxml.etree

    sjs = getIndex(service_journeys)
    keys = set(sjs.keys())
    parser = lxml.etree.XMLParser(remove_blank_text=True)
    tree = lxml.etree.parse("/tmp/Flix_Line_x400.xml", parser=parser)
    for element in tree.iterfind(".//{http://www.netex.org.uk/netex}ServiceJourney"):
        if element.attrib['id'] in keys:
            # Render the xsData object to XML text, parse it again, then swap it in.
            xml = serializer.render(sjs[element.attrib['id']], ns_map).encode('utf-8')
            element.getparent().replace(element, lxml.etree.fromstring(xml, parser))

As can be seen from the above, this requires the xsData object to be rendered to a string, parsed back from that string, and only then put in place of the old element. I think a direct write_etree or something similar would skip the serializer/deserializer overhead.
For creative use of this amazing library I'd like to propose to include a new serializer to render to an etree Element instead of a file/string.
The already existing ability to parse from Element objects has been very useful, but I don't see its complement in the current API.
Ideally the implementation would support both vanilla etree and lxml (like the parser API does with some configuration).
The use-case would be to mix xsData with existing code that has an etree model. Another more specific use-case is using XSLT on xsdata output (and other lxml related operations).
As a PoC we can use a kludgy work-around: parse the XML string serialized by xsData and continue from there, but this is obviously not ideal.
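For reference, the work-around can be sketched as a small helper. The name `to_element` is hypothetical and not part of xsData, and `render` stands in for whatever produces the XML string (for example the `render` method of an xsData serializer). The standard library's ElementTree is used here only to keep the sketch self-contained; `lxml.etree.fromstring` could be used the same way.

```python
import xml.etree.ElementTree as ET
from typing import Any, Callable


def to_element(obj: Any, render: Callable[[Any], str]) -> ET.Element:
    """Hypothetical helper: serialize `obj` to XML text, then parse it back.

    `render` is an assumption: any callable that returns an XML string.
    """
    xml_text = render(obj)
    # Encode to bytes before parsing: lxml rejects str input that carries
    # an encoding declaration, and bytes work with both lxml and the stdlib.
    return ET.fromstring(xml_text.encode("utf-8"))
```

Every call pays for a full serialize and a full parse, which is exactly the overhead a direct etree serializer would avoid.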
Looking at the code I see XmlSerializer and XmlWriter (and the Converters etc.) and I'm not sure where this would fit in. Maybe we'd still want the Converters to go back to etree-level data (like a fresh parse)? I don't know whether the most efficient etree would be built with an XmlSerializer or XmlWriter subclass, or with a different structure altogether. I can see how it could be made to work with the XmlWriter events, but that seems like overhead. To be fair, the IO-oriented APIs of these two classes don't quite fit this use-case as-is.
Anyway, I think the etree serializer functionality would be a natural addition to the API to go with the etree parser.
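To make the event-based idea concrete, here is a minimal sketch that turns a stream of start/attr/data/end events into an Element using the standard library's `TreeBuilder`. The event tuples and the `build_tree` function are assumptions made for illustration; they are not xsData's actual event format or API.

```python
from typing import Any, Dict, Iterable, Optional, Tuple
from xml.etree.ElementTree import Element, TreeBuilder


def build_tree(events: Iterable[Tuple[Any, ...]]) -> Element:
    """Build an Element from hypothetical (kind, *args) writer events."""
    builder = TreeBuilder()
    # A start tag is held back until its attributes have been collected.
    pending: Optional[Tuple[str, Dict[str, str]]] = None

    def flush() -> None:
        nonlocal pending
        if pending is not None:
            builder.start(pending[0], pending[1])
            pending = None

    for kind, *args in events:
        if kind == "start":
            flush()
            pending = (args[0], {})
        elif kind == "attr":
            if pending is None:
                raise ValueError("attribute event outside of a start tag")
            pending[1][args[0]] = str(args[1])
        elif kind == "data":
            flush()
            builder.data(str(args[0]))
        elif kind == "end":
            flush()
            builder.end(args[0])
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    # close() returns the root element once every start has a matching end.
    return builder.close()
```

With this approach no XML text is produced or parsed; the cost is one pass over the events, so the overhead in question is mostly that of generating the events themselves.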
Looking forward to your thoughts on this.