xml.etree.ElementTree encoding declaration should be capital ('UTF-8') rather than lowercase ('utf-8') #69235
Comments
It seems that in Python 3 the XML encoding declaration written by xml.etree.ElementTree has changed from 2.x: it is now lowercased, e.g. 'utf-8'. While the XML spec [1] says that decoders _SHOULD_ understand this, the encoding string _SHOULD_ be 'UTF-8'. Keeping to the standard, in the vein of being strictly conformant in encoding and lax in decoding, should give maximum compatibility. It also seems like an unhelpful change for 2.x to 3.x migration, though that is perhaps a minor issue (but it is how I noticed it). Can show with:
Cheers,
[1] <http://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncName> "In an encoding declaration, the values "UTF-8", "UTF-16", ... should be used for the various encodings and transformations of Unicode" and then later "XML processors should match character encoding names in a case-insensitive way".
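The "lax in decoding" half of the argument can be checked with a small sketch. This is not the reporter's script; the helper name `parses_with_declared_encoding` and the sample document are made up for illustration. It only shows that ElementTree's parser and Python's codec registry treat 'UTF-8' and 'utf-8' as the same encoding name.

```python
import codecs
import xml.etree.ElementTree as ET


def parses_with_declared_encoding(name: str) -> bool:
    """Return True if ElementTree parses a document that declares `name`.

    Hypothetical helper for illustration only.
    """
    doc = ("<?xml version='1.0' encoding='%s'?><hello beer='good'/>" % name)
    root = ET.fromstring(doc.encode("ascii"))
    return root.tag == "hello" and root.get("beer") == "good"


# The parser accepts either letter case in the encoding declaration.
both_parse = all(parses_with_declared_encoding(n) for n in ("UTF-8", "utf-8"))

# The codec registry resolves both spellings to the same codec.
same_codec = codecs.lookup("UTF-8").name == codecs.lookup("utf-8").name

print(both_parse, same_codec)
```

So the letter case only matters for what gets written, which is the part the spec recommends be uppercase.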
I agree that Python should not be converting the supplied encoding name to lowercase, although I guess reverting this has the potential to upset people’s output (e.g. if they depend on the checksum or something).
Here is a patch which changes the code to respect the letter case specified by the user, although it still compares the special strings "unicode", "us-ascii", and "utf-8" case-insensitively, and the default encoding is still lowercase. Let me know what you think.
>>> from sys import stdout
>>> from xml.etree.ElementTree import Element, ElementTree
>>> tree = ElementTree(Element('hello', {'beer': 'good'}))
>>> tree.write(stdout.buffer, encoding="UTF-8", xml_declaration=True); print()
<?xml version='1.0' encoding='UTF-8'?>
<hello beer="good" />
>>> tree.write(stdout.buffer, encoding="UTF-8"); print()
<hello beer="good" />
>>> tree.write(stdout.buffer, xml_declaration=True); print()
<?xml version='1.0' encoding='us-ascii'?>
<hello beer="good" />
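For anyone who wants to check the behaviour without writing to the terminal, here is a minimal sketch that captures the output in memory. It assumes an interpreter that already includes the patch; the helper name `declaration_line` is made up for illustration and is not part of ElementTree.

```python
import io
from xml.etree.ElementTree import Element, ElementTree


def declaration_line(encoding: str) -> str:
    """Serialize a tiny tree and return only its XML declaration line.

    Hypothetical helper for illustration only.
    """
    tree = ElementTree(Element("hello", {"beer": "good"}))
    buf = io.BytesIO()
    tree.write(buf, encoding=encoding, xml_declaration=True)
    return buf.getvalue().decode("ascii").splitlines()[0]


# The declaration should echo the letter case the caller supplied.
upper = declaration_line("UTF-8")
lower = declaration_line("utf-8")
print(upper)
print(lower)
```

With the patch applied, the first line should declare 'UTF-8' and the second 'utf-8', i.e. the name is passed through as given rather than lowercased.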
LGTM
Path looks fine and seems to work as expected -- Simeon
s/Path/Patch/
New changeset ff7aba08ada6 by Martin Panter in branch '3.4':
New changeset 9c248233754c by Martin Panter in branch '3.5':
New changeset 409bab2181d3 by Martin Panter in branch 'default':