Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should explicitly support xml:space="preserve" #74

Closed
TPS opened this issue Dec 28, 2015 · 2 comments
Closed

Should explicitly support xml:space="preserve" #74

TPS opened this issue Dec 28, 2015 · 2 comments

Comments

@TPS
Copy link

TPS commented Dec 28, 2015

Details as to necessity at leethomason/tinyxml2#242, & JayXon/Leanify#3 (comment) & preceding.

@zeux
Copy link
Owner

zeux commented Jan 10, 2016

So it seems that the cost of this feature outweighs the benefits.

The linked issue in Leanify would be resolved by both xml:space=preserve support, or by using parse_ws_pcdata_single (or parse_ws_pcdata and some advanced client code that strips unused whitespace - parse_ws_pcdata_single is inadequate for e.g. XHTML parsing).

Interestingly enough, the issue in tinyxml2 does not have xml:space="preserve" in the document, so it's irrelevant there - this is already served by parse_ws_* parsing flags pugixml provides. In general, for XHTML documents xml:space="preserve" won't help because it's almost never present in the documents directly.

Now, there's still some value in automatic preservation of whitespace if needed. XML recommendation seems to suggest that support for this is preferred. The problem is that implementation is very involved:

  • Requires tracking this flag during parsing for the entire subtree. Since pugixml uses a stackless parser, it's not obvious how to cleanly implement this without additional storage and without the need to examine the presence of the attribute whenever you close the node
  • Requires analyzing the attribute names during parsing which adds overhead to one of the hottest paths in the parser for some documents
  • or requires inspecting the attribute state during PCDATA parsing, which is once again one of the hottest paths in the parser

Combining the implementation challenges with the fact that in most cases the applications don't seem to benefit from this means that supporting this is not worth the trouble.

@zeux zeux closed this as completed Jan 10, 2016
@zeux zeux added the wontfix label Jan 10, 2016
@TPS
Copy link
Author

TPS commented Jan 10, 2016

Thanks for investigating this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants