[FreshRSS] decode special chars for MAYBE_HTML
Patch for simplepie#350

It is quite common for feeds (e.g. Facebook, as reported above) to have
some MAYBE_HTML sections HTML-encoded, which SimplePie currently does not
handle:
```xml
<title><![CDATA[ L&#39;alpha 11 est arriv&#xe9;e...]]></title>
```

This is the approach currently used (with success) in FreshRSS:
FreshRSS/FreshRSS#754
FreshRSS/FreshRSS#813
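
To illustrate what the added decoding step does, here is a minimal standalone sketch. It is not SimplePie code: the sample string is hypothetical (modelled on the title above, using the `&#039;` form of the apostrophe entity), and the pattern is a simplified version of the entity half of the check in `sanitize()`, without the `SIMPLEPIE_PCRE_HTML_ATTRIBUTE` part.

```php
<?php
// Hypothetical MAYBE_HTML payload: some special characters are HTML-encoded,
// and one accented letter is given as a numeric character reference.
$raw = 'L&#039;alpha 11 est arriv&#xe9;e &amp; &lt;b&gt;bold&lt;/b&gt;';

// Same call as the line added by this patch. With ENT_QUOTES it decodes
// &amp;, &lt;, &gt;, &quot; and &#039;, and leaves other entities untouched.
$decoded = htmlspecialchars_decode($raw, ENT_QUOTES);

// Simplified entity check (assumption: only the "&entity" alternative of the
// pattern used in sanitize() is reproduced here).
$looksLikeHtml = preg_match('/&(#(x[0-9a-fA-F]+|[0-9]+)|[a-zA-Z0-9]+)/', $decoded) === 1;

echo $decoded, "\n";
echo $looksLikeHtml ? "treated as HTML\n" : "treated as text\n";
```

In this sketch the apostrophe, ampersand and angle brackets are decoded, while `&#xe9;` is left as-is, so the remaining entity still triggers the HTML detection.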
Alkarex committed Apr 5, 2015
1 parent 9a9faaa commit 7671234
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions library/SimplePie/Sanitize.php
@@ -229,6 +229,7 @@ public function sanitize($data, $type, $base = '')
 {
     if ($type & SIMPLEPIE_CONSTRUCT_MAYBE_HTML)
     {
+        $data = htmlspecialchars_decode($data, ENT_QUOTES);
         if (preg_match('/(&(#(x[0-9a-fA-F]+|[0-9]+)|[a-zA-Z0-9]+)|<\/[A-Za-z][^\x09\x0A\x0B\x0C\x0D\x20\x2F\x3E]*' . SIMPLEPIE_PCRE_HTML_ATTRIBUTE . '>)/', $data))
         {
             $type |= SIMPLEPIE_CONSTRUCT_HTML;