Modify serializer for xml.etree.ElementTree to allow forcing the use of long tag closing #58585
Comments
As it stands in Hg, when the write() method of an xml.etree.ElementTree object is called and a tag within the XML tree has no child tags or defined text, the tag is written using the short notation "<tag ... />". Whether the short notation or the long "<tag ...></tag>" notation is used should be configurable by the programmer, without having to resort to serializing the XML into a string and then calling replace() on that string. The attached patch adds an optional parameter to the write() method that provides this choice. |
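For illustration, here is a minimal sketch of the behaviour described above and of the kind of string post-processing the report wants to avoid. The element names are made up, and the regular expression is only an assumed stand-in for the replace() workaround; it is fragile and not something the patch proposes.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical tree: one element with no text and no children.
root = ET.Element("root")
ET.SubElement(root, "empty", {"id": "1"})

# Default serialization uses the short notation for the empty element.
default_xml = ET.tostring(root, encoding="unicode")

# Assumed workaround: rewrite "<tag ... />" into "<tag ...></tag>" after
# the fact. This is text surgery on serialized XML, shown only to
# illustrate why a serializer option is preferable.
long_xml = re.sub(r"<(\w+)([^>]*?) />", r"<\1\2></\1>", default_xml)

print(default_xml)
print(long_xml)
```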
+ if text or len(elem) or long_xml:
Use alternatives in order of decreasing probability. |
Hello, thanks for the patch! Since this is a new feature, I suggest discussing it on the python-ideas list first. Next, as for your patch:
|
To answer eli.bendersky's questions:
The changes I made were for the ElementTree.py file under cpython/Lib/xml/etree/ . The source for the 'ElementC14N' module is not part of Python, so I cannot modify the code for the '_serialize_c14n' function. Looks like I may need to refactor this patch to work in a way that does not alter the signature for the _serialize_* methods. |
Any progress, or can this issue be closed? |
Made a new patch. The changes within this patch do not change the signature for the _serialize_* methods, so it can be used with any third-party library that extends ElementTree. |
I don't think that three new fields in each Element are a suitable price for this very rarely used feature. |
Agree with Serhiy. Why are these flags required in Element? Also, I'm moving this to 3.4 since the patch came too late in the 3.3 process - the first beta is very soon, after which we prefer not to add new features. |
The xml.sax.saxutils.XMLGenerator constructor has a parameter short_empty_elements (False by default). For consistency, the new ElementTree.write parameter must have the same name (True by default for compatibility). |
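A small sketch of the existing XMLGenerator parameter mentioned above, for comparison. The helper name render_empty is hypothetical; XMLGenerator and its short_empty_elements argument are from the standard library.

```python
import io
from xml.sax.saxutils import XMLGenerator


def render_empty(short):
    """Emit a single empty element and return the generated text."""
    out = io.StringIO()
    gen = XMLGenerator(out, encoding="utf-8", short_empty_elements=short)
    gen.startDocument()
    gen.startElement("empty", {})
    gen.endElement("empty")
    gen.endDocument()
    return out.getvalue()


long_form = render_empty(False)   # XMLGenerator's default behaviour
short_form = render_empty(True)
print(long_form)
print(short_form)
```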
Ideally, this would be taken care of by _serialize_xml(), with a parameter passed when it is called from within write(). However, the signature of _serialize_xml() cannot be changed, as it needs to match the signature of the other _serialize_*() functions (the serializing function is looked up in a dictionary and then called with the same parameters, whichever one was chosen). An alternative would be to set a single variable within the scope of ElementTree at runtime when the caller asks for full closing tags, and have _serialize_xml() check for the presence and value of that variable. I initially approached the problem via flags on Element because of the perceived usefulness of giving the programmer full control over how the tree is serialized into XML. However, if I'm the only one who sees that as useful, I can certainly refactor the code to go with the above solution (or some other more elegant solution). |
I see no harm in modifying the signature of the private _serialize_* functions to accept another argument or dict of options. |
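As a rough sketch of that suggestion: the serializers can be dispatched from a dictionary while each one accepts extra keyword options and ignores those it does not need. All names here (serialize_xml, serialize_text, SERIALIZERS, render) are hypothetical, and the toy serializer skips escaping and namespaces; it is not the actual ElementTree code.

```python
import xml.etree.ElementTree as ET


def serialize_xml(write, elem, short_empty_elements=True):
    # Toy serializer: no escaping, no namespace handling.
    attrs = "".join(f' {k}="{v}"' for k, v in elem.items())
    write(f"<{elem.tag}{attrs}")
    if elem.text or len(elem) or not short_empty_elements:
        write(">")
        if elem.text:
            write(elem.text)
        for child in elem:
            serialize_xml(write, child, short_empty_elements)
        write(f"</{elem.tag}>")
    else:
        write(" />")
    if elem.tail:
        write(elem.tail)


def serialize_text(write, elem, **_unused_options):
    # Text output has no tags, so the option is accepted and ignored.
    for part in elem.itertext():
        write(part)


# Dispatch table: every entry is called with the same arguments.
SERIALIZERS = {"xml": serialize_xml, "text": serialize_text}


def render(elem, method="xml", **options):
    parts = []
    SERIALIZERS[method](parts.append, elem, **options)
    return "".join(parts)


tree = ET.fromstring("<a><b/>hi</a>")
print(render(tree))
print(render(tree, short_empty_elements=False))
print(render(tree, method="text", short_empty_elements=False))
```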
Ariel, are you interested in pursuing this issue? Serhiy, I see you assigned this to yourself - would you like to submit a patch? |
Not right now. This is low priority for me too. But I want to see this feature in 3.4. |
Well, here is a patch which adds a short_empty_elements flag (as for XMLGenerator) to the write(), tostring() and tostringlist() methods of ElementTree. |
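A usage sketch of the flag as it is available in the standard library from Python 3.4 onward. The element names are made up for illustration.

```python
import io
import xml.etree.ElementTree as ET

root = ET.Element("root")
ET.SubElement(root, "empty")

# tostring(): default keeps the short form, False forces long closing tags.
short_out = ET.tostring(root, encoding="unicode")
long_out = ET.tostring(root, encoding="unicode", short_empty_elements=False)

# tostringlist() accepts the same keyword.
joined = "".join(
    ET.tostringlist(root, encoding="unicode", short_empty_elements=False)
)

# ElementTree.write() accepts it as well.
buf = io.StringIO()
ET.ElementTree(root).write(buf, encoding="unicode", short_empty_elements=False)
written = buf.getvalue()

print(short_out)
print(long_out)
```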
Patch updated (tostring() and tostringlist() refer to write() for the short_empty_elements parameter). Perhaps the descriptions of the encoding and method parameters should not be repeated either?
Because the order of parameters in XMLGenerator(), ElementTree.write() and ElementTree.tostring() differs, and this can be confusing. Also, it will be easier to deprecate or rename a keyword-only parameter in the future (in favor of a general fabric, for example). I think that all optional, non-basic and very rarely used parameters should be keyword-only. |
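To make the keyword-only point concrete, the following sketch checks that the flag cannot be passed positionally to tostring() in released Python versions (3.4 and later), so callers cannot depend on its position in the signature.

```python
import xml.etree.ElementTree as ET

root = ET.Element("root")

# Passing the flag as a fourth positional argument is rejected,
# because short_empty_elements is declared keyword-only.
try:
    ET.tostring(root, "unicode", "xml", False)
except TypeError:
    positional_rejected = True
else:
    positional_rejected = False

# The keyword form is the supported spelling.
keyword_result = ET.tostring(root, "unicode", "xml", short_empty_elements=False)

print(positional_rejected)
print(keyword_result)
```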
New changeset 58168d69b496 by Eli Bendersky in branch 'default': |
I don’t think a space before the slash should be added. (It was common in the days of XHTML 1 because of an SGML parsing hack.) |
On Sun, Jan 13, 2013 at 6:09 AM, Éric Araujo <report@bugs.python.org> wrote:
Ok, will fix. |
I think Éric means different spaces: the spaces in empty tags (<empty /> vs <empty/>). I don't know what the standard says about this. It should be a separate issue. As for line continuations in the docs, in all cases where they occur, a space is used before the backslash for readability. I have reverted this change in 50606131a987. |
OK, thanks. |