Minidom does not have to escape quote inside text segments #81555
Comments
I am using Minidom to pretty-print XML. But currently, if there is a quote inside a text segment, it escapes it to `&quot;`. To my understanding this is unnecessary as long as all other special symbols are escaped. This escaping makes the output really ugly and defeats the purpose of using Minidom for pretty-printing XML.
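For reference, a minimal sketch of the behavior being reported. The element names and the sample condition are made up for illustration. Whether the quotes come out as `"` or `&quot;` depends on the Python version in use, so no exact output is shown here.

```python
from xml.dom import minidom

# Illustrative document: a text segment that contains double quotes.
source = '<rule><condition>request.verb = "GET"</condition></rule>'
doc = minidom.parseString(source)

# Pretty-print it. Depending on the Python version, the quotes in the
# text segment appear either literally or as &quot;.
pretty = doc.toprettyxml(indent="  ")
print(pretty)

# Either way, parsing the output back yields the original text.
roundtrip = minidom.parseString(pretty)
condition = roundtrip.getElementsByTagName("condition")[0]
text = "".join(node.data for node in condition.childNodes).strip()
print(text)
```

Both forms round-trip to the same text, so the complaint is about readability of the serialized output, not about correctness.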
Wouldn't it be better to support this as a parameter? Escaping is pretty useful in HTML contexts.
FYI, this is exactly how ElementTree.tostring does it. So this would make minidom behave the same as ElementTree.tostring. @dhilst: I do not think a parameter is needed here. This is completely compatible with HTML. It is just that currently an additional, unnecessary escaping is done (in both XML and HTML contexts).
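A short sketch of the ElementTree behavior referred to here, using a made-up element and attribute. Quotes in text are left alone, while quotes inside attribute values are still escaped, since there they would otherwise end the attribute.

```python
import xml.etree.ElementTree as ET

# Illustrative element with a quote in an attribute value and in the text.
elem = ET.Element("condition", {"label": 'the "main" rule'})
elem.text = 'request.verb = "GET" & x < 1'

out = ET.tostring(elem, encoding="unicode")
print(out)

# In text, only & and < (and >) are escaped; double quotes stay literal.
# In the attribute value, the double quotes become &quot;.
```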
This changes behavior in an irreversible way. A parameter would make it reversible.
Sure, but is the old behavior useful in any use case? Every bugfix changes old behavior in an irreversible way. So in which use case would you want the old behavior? Can you elaborate here?
Not really, I don't have a use case here. I'm just warning that this would break user code that relies on the old behavior. Anyway, it is possible to add the new behavior without changing the old one. A parameter would make this possible, for example. This is only my opinion; let others weigh in too.
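To make the "parameter" idea concrete, here is a rough sketch built on `xml.sax.saxutils.escape`, which already takes an optional mapping of extra replacements. The `escape_text` helper and its `escape_quotes` flag are hypothetical and are not part of minidom's API.

```python
from xml.sax.saxutils import escape


def escape_text(data, escape_quotes=False):
    """Hypothetical helper: escape text content, optionally escaping quotes too."""
    # escape() always handles &, < and >; extra entities are opt-in.
    extra = {'"': "&quot;"} if escape_quotes else {}
    return escape(data, extra)


plain = escape_text('a < "b"')
quoted = escape_text('a < "b"', escape_quotes=True)
print(plain)   # a &lt; "b"
print(quoted)  # a &lt; &quot;b&quot;
```

With a flag like this, callers who depend on the old output could opt back in, while the default stays readable.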
Almost 2 years later I only registered to agree with Daniel. This is extremely annoying. Use case: I am generating XMLs for Apigee, where my conditions contain quotes and there's no need to escape them.
Sorry, to agree with @mitar
Any chance we can take action on this? I agree with the concern. XML standard says that escaping quotes in content data is optional. minidom is forcing a change to perfectly valid XML.
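As a quick check of the "optional" point, the sketch below parses the same made-up content once with literal quotes and once with `&quot;`. A conforming parser hands back identical character data for both.

```python
from xml.dom import minidom


def text_of(xml_source):
    """Return the concatenated text content of the document element."""
    doc = minidom.parseString(xml_source)
    return "".join(node.data for node in doc.documentElement.childNodes)


literal_text = text_of('<c>verb = "GET"</c>')
escaped_text = text_of("<c>verb = &quot;GET&quot;</c>")

print(literal_text == escaped_text)  # True
```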
I read the spec, and this case is actually not explicitly specified. The spec says that [...] Also, the fact that other libraries behave differently does not mean that minidom must. It's been around for more than 20 years, and the current behaviour is probably just as old and settled. According to the comments above, there seem to be use cases where this can be an annoyance, but even there, the result is not wrong, just a bit less readable than it could be. While only C14N makes guarantees w.r.t. the exact byte serialisation (so neither minidom nor ElementTree needs to), any change here will break someone's code out there, so there is a downside to changing the output. A deliberate change without need means deliberate breakage without need. To me, there is no definite reason to change the current behaviour. As much as I second the preference for clean XML output, I'd rather avoid changing the output without a compelling reason.
The spec also doesn't say that a serializer shouldn't turn all the Zs into Ys, but I'm pretty sure we would all consider that a bug. If user data would be valid XML, why modify it regardless of user intent? If the answer is simply "go use a different library which doesn't have an opinion on this", then I will. Just disappointed that minidom isn't usable at present for cases where a user cares about quotes in text.
Reopening since #107947 gives us a reason to change the output, so we can solve this issue along the way.