XHTML exceptions for tags #10

technosophos opened this Issue · 20 comments

2 participants


Certain tags in XHTML cannot use a unary (self-closing) form. QueryPath needs to handle those cases in the xhtml() method.



If the xhtml() method were to call the document save method with the LIBXML_NOEMPTYTAG option, this could be handled.
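A minimal sketch of what that flag does, using plain DOMDocument rather than QueryPath itself (the sample markup is made up for illustration):

```php
<?php
// Hypothetical input: any well-formed XML/XHTML fragment will do.
$doc = new DOMDocument();
$doc->loadXML('<div><script src="a.js"/><br/></div>');

// Default serialization collapses empty elements to the unary form.
$collapsed = $doc->saveXML($doc->documentElement);

// LIBXML_NOEMPTYTAG forces an explicit closing tag on every empty element.
$expanded = $doc->saveXML($doc->documentElement, LIBXML_NOEMPTYTAG);
```

Note that the flag applies to every empty element, so tags like br come out with a separate closing tag as well and would need to be collapsed again afterwards.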


Very good idea. This is now done.

SHA 15515fb


Looks like a single regex could probably do all of the heavy lifting. I think this could be a very compelling solution to an otherwise annoying problem.

Thoughts on what the "right" list of tags to collapse is? Do you think the list in the comment you pointed to is complete?
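For the sake of discussion, here is roughly what a single-regex pass could look like. This is only a sketch: the tag list and the function name are assumptions, not what is in QueryPath.

```php
<?php
// Hypothetical list of elements that should be written in unary form;
// the thread does not settle which tags belong here.
const COLLAPSIBLE_TAGS = ['br', 'hr', 'img', 'input', 'meta', 'link'];

function collapseEmptyTags(string $xhtml): string
{
    $names = implode('|', COLLAPSIBLE_TAGS);
    // Match <tag attrs></tag> and rewrite it as <tag attrs />.
    $pattern = '#<(' . $names . ')\b([^<>]*)></\1>#i';
    return preg_replace($pattern, '<$1$2 />', $xhtml);
}
```

Tags that are not in the list, such as script, keep their explicit closing tag.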


I checked in a solution for the <br />, <hr /> etc tags. If you wanna give it a go, it's on GitHub now.

I'm looking into what the best solution to the CDATA one is. I have to make sure that the output always remains compatible with an XML parser, which means I can't remove the CDATA section without replacing it with... something. (Otherwise, a strict XML parser will choke on greater-than and AND operators, single and double quotes, and the like.)


One solution that will keep it XML compatible but also HTML compatible:

I'm thinking that this one will work correctly in the greatest number of cases, so I will probably modify your regex to do that. Does that sound about right?
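One widely used way to keep both parsers happy is to hide the CDATA markers inside script comments. A rough sketch, assuming that is the kind of replacement meant here (the helper name and sample script are hypothetical):

```php
<?php
// Hypothetical helper: hide CDATA markers inside C-style comments so a
// script engine ignores them while an XML parser still sees the CDATA section.
function commentOutCdata(string $markup): string
{
    return str_replace(
        ['<![CDATA[', ']]>'],
        ['/*<![CDATA[*/', '/*]]>*/'],
        $markup
    );
}

$source = '<script><![CDATA[ if (a > b && c) { run(); } ]]></script>';
$result = commentOutCdata($source);

// The result must still load in a strict XML parser.
$check = new DOMDocument();
$stillXml = $check->loadXML($result);
```

The same markers work inside CSS, since it uses the same comment syntax.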


I'm trying to reproduce your other reported error, too. (the one about the double-slash on the last element)


Have I got the solution for you!

The code I just checked in allows you to configure how the CDATA section gets replaced. So in the $options, you can add this:

 $options = array('escape_xhtml_js_css_sections' => QueryPath::JS_CSS_ESCAPE_NONE);
 qp($html, $css, $options)->xhtml();

That will simply remove the CDATA parts.


Can you give me an example doc that generates this problem?

Input: <br /> <img />
Output looks like this: <br /> <img / />

I can't seem to reproduce it here.


Yes, I can see variants of the error now. I wonder if my regex is mis-matching something in the doctype declaration.


Ah, I found it.

The regular expression did not appropriately handle cases where the tag was unary already, and (oddly enough) libxml added a few on its own. So in the end, it was a two-char fix to the regular expression.
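To illustrate how this kind of bug can arise, here is a sketch with two made-up patterns. Neither is the actual QueryPath regex; the first just shows how an already-unary tag can pick up a second slash, and the second shows one way to guard against it.

```php
<?php
// Both patterns are illustrative assumptions, not QueryPath's actual regex.
// Naive: the attribute group also swallows the " /" of an already-unary tag.
$naive = '#<(br|hr|img)\b([^<>]*)>(?:</\1>)?#';
// Guarded: trailing whitespace and an optional slash are matched outside the group.
$guarded = '#<(br|hr|img)\b([^<>]*?)\s*/?>(?:</\1>)?#';

$input = '<br></br> <img />';

$broken = preg_replace($naive, '<$1$2 />', $input);
$fixed  = preg_replace($guarded, '<$1$2 />', $input);
```

With this input, $broken ends up with the doubled slash on the img tag, while $fixed leaves the already-unary tag alone.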


I'm going to mark this as "closed" (which is a sure-fire way of finding a new related bug). Please re-open if any additional XHTML creation errors are found.

sdboyer referenced this issue from a commit in sdboyer/querypath
@technosophos Fixed issue #10, fixed docs.
- The XHTML methods now do not collapse tags in unary fashion.
- Made sure to update docs for 2.1
- Credited Emily for her new contributions
- Updated unit tests
sdboyer referenced this issue from a commit in sdboyer/querypath
@technosophos Fixing CDATA issue in #10. 2728bfe
This issue was closed.