Zend\Dom\Query and special UTF-8 characters #7618
Comments
OK, forget my first "solution". It's bad because e.g. ...
$html = '<div><h1>€</h1></div>';
$dom = new Query(utf8_decode($html));
$nodes = $dom->execute('h1');
Debug::dump($nodes->current()->nodeValue);
...will result in:
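The dumped output did not survive in this thread. As a hedged, standalone sketch of the underlying effect (not the Zend\Dom\Query call itself): ISO-8859-1 has no euro sign, so a UTF-8 to ISO-8859-1 conversion cannot keep `€`. The sketch uses `mb_convert_encoding()` and therefore assumes the mbstring extension; `utf8_decode()` from the snippet above is deprecated as of PHP 8.2, and both functions substitute `?` for unmappable characters by default.

```php
<?php
// Standalone illustration; assumes the mbstring extension is loaded.
$euro   = "\xE2\x82\xAC"; // "€" as UTF-8 bytes (U+20AC)
$eAcute = "\xC3\xA9";     // "é" as UTF-8 bytes (U+00E9)

// U+00E9 exists in ISO-8859-1, so it maps to the single byte 0xE9.
$latinE = mb_convert_encoding($eAcute, 'ISO-8859-1', 'UTF-8');

// U+20AC does not exist in ISO-8859-1, so it cannot survive the conversion
// and is replaced by mbstring's substitute character.
$latinEuro = mb_convert_encoding($euro, 'ISO-8859-1', 'UTF-8');

var_dump(bin2hex($latinE));     // string(2) "e9"
var_dump($latinEuro === $euro); // bool(false)
```

So any approach that routes the whole document through ISO-8859-1 loses every character above U+00FF before the DOM parser even sees it.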
This is because all that
The real problem is that
So, based on this comment I again extended Zend\Dom\Query:
<?php
namespace MyNamespace\Dom;

use Zend\Dom\Query as ZF2Query;

class Query extends ZF2Query
{
    /**
     * Set document to query
     *
     * @param  string      $document
     * @param  null|string $encoding Document encoding
     * @return Query
     */
    public function setDocument($document, $encoding = null)
    {
        if (0 === strlen($document)) {
            return $this;
        }

        // Build an encoding hint to prepend, unless the encoding is ISO-8859-1
        $prepend   = '';
        $_encoding = empty($encoding) ? $this->getEncoding() : $encoding;
        if (!empty($_encoding) && strtolower($_encoding) != 'iso-8859-1') {
            $prepend = sprintf('<?xml encoding="%s">', $_encoding);
        }

        // breaking XML declaration to make syntax highlighting work
        if ('<' . '?xml' == substr(trim($document), 0, 5)) {
            if (preg_match('/<html[^>]*xmlns="([^"]+)"[^>]*>/i', $document, $matches)) {
                $this->xpathNamespaces[] = $matches[1];
                return $this->setDocumentXhtml($prepend . $document, $encoding);
            }
            // Plain XML is passed through without the hint
            return $this->setDocumentXml($document, $encoding);
        }
        if (strstr($document, 'DTD XHTML')) {
            return $this->setDocumentXhtml($prepend . $document, $encoding);
        }
        return $this->setDocumentHtml($prepend . $document, $encoding);
    }
}
Still, two questions remain:
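For readers who want to try the prefix from the `setDocument()` override outside of Zend\Dom\Query, here is a minimal sketch using PHP's plain `DOMDocument`. It assumes ext-dom is available and that the input is UTF-8; `firstH1Text()` is a hypothetical helper introduced only for this illustration, not part of any library.

```php
<?php
// Minimal sketch with PHP's DOMDocument (ext-dom); names are illustrative.
$euro = "\xE2\x82\xAC"; // "€" as UTF-8 bytes
$html = '<div><h1>' . $euro . '</h1></div>';

/**
 * Hypothetical helper: load an HTML fragment and return the text
 * of its first <h1> element.
 */
function firstH1Text($markup)
{
    $dom = new DOMDocument();
    // Collect libxml warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $dom->getElementsByTagName('h1')->item(0)->nodeValue;
}

// Same prefix that the override builds for a UTF-8 document.
$hinted = sprintf('<?xml encoding="%s">', 'UTF-8') . $html;

// Compare the parsed text with and without the encoding hint.
var_dump(bin2hex(firstH1Text($html)));
var_dump(firstH1Text($hinted) === $euro);
```

The prefix is a workaround rather than valid HTML, so it leaves a processing-instruction node at the top of the parsed document; whether that matters depends on how the document is used afterwards.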
AFAIK, if no header is present, the passed encoding is used; if the header is present, the passed encoding is ignored. So if your documents are always in iso-8859-1, then just try
This issue has been moved from the
...will result in something like:
... will solve the problem and result in correct rendering.
For convenience I extended Zend\Dom\Query:
Now I wonder whether this could perhaps be implemented in Zend\Dom\Query. Or am I missing something, and is there a better solution?
Thanks
m.