ElementTree ProcessingInstruction uses character entities in content #46995
In the ElementTree and cElementTree implementations in Python 2.5 (and ...

>>> from xml.etree.ElementTree import *
>>> tostring(ProcessingInstruction('test', '<testing&>'))
'<?test &lt;testing&amp;&gt;?>'
>>> from xml.etree.cElementTree import *
>>> tostring(ProcessingInstruction('test', '<testing&>'))
'<?test &lt;testing&amp;&gt;?>'

The XML 1.0 spec is rather vague on whether character entities are ...
"The ampersand character (&) and the left angle bracket (<) MUST NOT ...
So, XML reserved chars don't need converting in PIs (the only string ...

Breaks generated PHP:

>>> from xml.etree.cElementTree import *
>>> doc = Element('html')
>>> SubElement(doc, 'head')
<Element 'head' at 0x2af4e3b8a9f0>
>>> SubElement(doc, 'body')
<Element 'body' at 0x2af4e3b922a0>
>>> doc[1].append(ProcessingInstruction('php', 'if (2 < 1) print "<p>Something has gone horribly wrong!</p>";'))
>>> tostring(doc)
'<html><head /><body><?php if (2 &lt; 1) print "&lt;p&gt;Something has gone horribly wrong!&lt;/p&gt;";?></body></html>'

Different from xml.dom:

>>> from xml.dom.minidom import *
>>> i = getDOMImplementation()
>>> doc = i.createDocument(None, 'html', None)
>>> doc.documentElement.appendChild(doc.createElement('head'))
<DOM Element: head at 0x8c6170>
>>> doc.documentElement.appendChild(doc.createElement('body'))
<DOM Element: body at 0x8c6290>
>>> doc.documentElement.lastChild.appendChild(doc.createProcessingInstruction('test', '<testing&>'))
<xml.dom.minidom.ProcessingInstruction instance at 0x8c63b0>
>>> doc.toxml()
'<?xml version="1.0" ?>\n<html><head/><body><?test <testing&>?></body></html>'

Different from lxml:

>>> from lxml.etree import *
>>> tostring(ProcessingInstruction('test', '<testing&>'))
'<?test <testing&>?>'

I suspect the only change necessary to fix this is to replace the ...

Index: elementtree/ElementTree.py
--- elementtree/ElementTree.py (revision 511)
+++ elementtree/ElementTree.py (working copy)
@@ -663,9 +663,9 @@
# write XML to file
tag = node.tag
if tag is Comment:
- file.write("<!-- %s -->" % _escape_cdata(node.text, encoding))
+ file.write("<!-- %s -->" % _encode(node.text, encoding))
elif tag is ProcessingInstruction:
- file.write("<?%s?>" % _escape_cdata(node.text, encoding))
+ file.write("<?%s?>" % _encode(node.text, encoding))
else:
items = node.items()
xmlns_items = [] # new namespaces in this scope

Sorry I haven't got a similar patch for cElementTree. I've had a quick ...
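
The effect of the proposed change can be shown in isolation. This is a minimal sketch, assuming Python 3; `escape_cdata` and `serialize_pi` are hypothetical helpers written for illustration and are not ElementTree's private functions. It only contrasts escaping the processing instruction text with writing it through unchanged.

```python
def escape_cdata(text):
    # Hypothetical stand-in for the escaping applied to element text:
    # replace the XML reserved characters with character entities.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def serialize_pi(text, escape):
    # `text` holds the PI target and content joined by a space.
    # escape=True mimics the reported behaviour, escape=False the patched one.
    body = escape_cdata(text) if escape else text
    return "<?%s?>" % body


old = serialize_pi("test <testing&>", escape=True)
new = serialize_pi("test <testing&>", escape=False)
print(old)
print(new)
```

With `escape=False` the content reaches the output verbatim, which is what a PHP processing instruction needs in order to remain valid PHP.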
cElementTree.ElementTree is a copy of ElementTree.ElementTree with the ...
The copying of the ElementTree class into cElementTree happens in the ...
Patch which includes the given fix and adds a test case to cover this ...
Previous patch was missing two lines in the test case. Correct fix uploaded ...
Issue also affects py3k. Adapted patch attached.
Previous upload of issue_2746 was corrupt. Fixed version uploaded.
Can you include the cElementTree fix and test case in your patch as well?
Oops, sorry, I hadn't read your message about the patch also correcting ...
I've committed the patch in r78125 (trunk) and r78126 (py3k). I'm not sure I want to backport it to 2.6/3.1, since it might bite people who relied on the old behaviour.
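
Since the fix was not necessarily backported, it can be useful to check what a given interpreter does. The following is a minimal sketch, assuming a Python 3 interpreter whose standard library includes the committed change; the variable names are illustrative only. It serializes the same processing instruction with ElementTree and with xml.dom.minidom so the two results can be compared.

```python
import xml.etree.ElementTree as ET
from xml.dom.minidom import getDOMImplementation

# ElementTree: serialize a standalone processing instruction as a str.
pi = ET.ProcessingInstruction("test", "<testing&>")
et_output = ET.tostring(pi, encoding="unicode")

# minidom: build a small document holding the same processing instruction.
doc = getDOMImplementation().createDocument(None, "html", None)
doc.documentElement.appendChild(
    doc.createProcessingInstruction("test", "<testing&>")
)
dom_output = doc.documentElement.toxml()

# Ordinary element text is still escaped, only PI content is left alone.
elem = ET.Element("p")
elem.text = "<testing&>"
elem_output = ET.tostring(elem, encoding="unicode")

print(et_output)
print(dom_output)
print(elem_output)
```

If `et_output` contains `&lt;` or `&amp;`, the interpreter still has the old behaviour described in this issue.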