Facebook's RSSes aren't displayed correctly #754
Comments
It looks like an encoding problem. When I open the feed with Firefox, I have the same problem. I think it is not specific to FRSS.
Yes, but the SimplePie demo and Liferea handle it fine, which is why I asked @AugierLe42e to open a ticket. I haven't inspected the code in depth, but it could be a problem in FRSS. By the way, this problem reminds me of another one, but I cannot find it in the ticket archive.
Maybe this one: #661? It is still open.
This RSS2 feed uses a CDATA container together with HTML entities in the `<title>` element. There is also the Atom 1.0 version of the same feed: https://www.facebook.com/feeds/page.php?format=atom10&id=166081103477984
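To illustrate why such a title shows up with raw entities: an XML parser treats everything inside a CDATA section as literal text, so a character reference like `&#39;` is not resolved there, whereas the same reference in plain element text is. A minimal Python sketch, assuming a made-up sample feed modelled on the Facebook title (this is not FreshRSS or SimplePie code):

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the same title, once wrapped in CDATA and once as plain XML text.
SAMPLE = """<rss version="2.0"><channel>
<item><title><![CDATA[L&#39;alpha 11]]></title></item>
<item><title>L&#39;alpha 11</title></item>
</channel></rss>"""

titles = [item.findtext("title") for item in ET.fromstring(SAMPLE).iter("item")]
cdata_title, plain_title = titles

print(repr(cdata_title))  # the reference is kept literally inside CDATA
print(repr(plain_title))  # the reference is resolved by the XML parser
```

A reader that displays the CDATA title as-is will therefore show `&#39;` instead of an apostrophe unless it decodes the entities itself.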
Here is another one causing problems:
Similar problem with http://www.atterres.org/rss.xml: among the many errors, this feed is using HTML in `<title>`.
Would it be difficult to detect this and sanitize the title? E.g. by using a regexp?
@AugierLe42e The problem would be to avoid breaking other valid feeds (it is a valid use-case to want to write e.g.
Caused searches such as "intitle:&" to fail after paging, and possible XSS vulnerabilities. Discovered during #754
Any news on this? This is really annoying.
I will look at it. For reference, it relates to SIMPLEPIE_CONSTRUCT_MAYBE_HTML in SimplePie: https://github.com/FreshRSS/FreshRSS/blob/beta/lib/SimplePie/SimplePie/Sanitize.php#L250
FreshRSS#754 Needs to check with many feeds to see if this does not introduce incompatibilities with some valid feeds.
Here is a patch: #813
@AugierLe42e I have made a new version, which now always decodes XML-escaped content of type SIMPLEPIE_CONSTRUCT_MAYBE_HTML. It should have a more stable and predictable behaviour. Please test if you can: #813 (comment)
BTW, there is a bug report in SimplePie, simplepie/simplepie#350, so we could consider sending this patch upstream if it works well.
So, it seems to work fine, but feel free to re-open in case of problems.
Patch for simplepie#350

It is quite common for feeds (e.g. Facebook as reported above) to have some MAYBE_HTML section HTML-encoded, which is currently not handled by SimplePie:

```xml
<title><![CDATA[ L&#39;alpha 11 est arrivée...]]></title>
```

This is the approach currently used (with success) in FreshRSS: FreshRSS/FreshRSS#754 FreshRSS/FreshRSS#813
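The idea of decoding such a field once, then escaping it again on output, can be sketched outside SimplePie as follows. This is a Python illustration under stated assumptions, not the PHP patch itself: `decode_maybe_html` is a hypothetical helper name, and the sample title is shortened from the one above.

```python
import html
import xml.etree.ElementTree as ET


def decode_maybe_html(text: str) -> str:
    """Decode HTML entities once in a field that may contain HTML-encoded text.

    Hypothetical helper for illustration only; the actual fix lives in
    SimplePie's sanitizer and is written in PHP.
    """
    return html.unescape(text)


# Made-up item, modelled on the Facebook title reported in this issue.
ITEM = "<item><title><![CDATA[L&#39;alpha 11]]></title></item>"

raw_title = ET.fromstring(ITEM).findtext("title")  # entity still literal
decoded_title = decode_maybe_html(raw_title)       # entity turned into an apostrophe

# Decoding can turn escaped markup back into real markup, so the value
# should be escaped again before being written into an HTML page.
risky = decode_maybe_html("&lt;script&gt;alert(1)&lt;/script&gt;")
safe_for_output = html.escape(risky)

print(decoded_title)
print(safe_for_output)
```

Decoding exactly once and escaping at output time keeps the stored title predictable, which matters for searches on titles and for avoiding the XSS risk mentioned earlier in the thread.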
Yes, it works fine now. Thanks!
One picture is worth a thousand words:
Here is the RSS feed that causes problems:
https://www.facebook.com/feeds/page.php?format=rss20&id=166081103477984