W3C validated feed fails to parse: http://local.sfgate.com/177387/blog/rss/ #418

Closed
dhilowitz opened this issue Sep 25, 2015 · 8 comments

Comments

@dhilowitz

Hi all,

The following feed fails to parse using SimplePie: http://local.sfgate.com/177387/blog/rss/

The feed validates using every validator I've found. See here: https://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Flocal.sfgate.com%2F177387%2Fblog%2Frss%2F

I've tried it both using code and via the demo page: http://simplepie.org/demo/?feed=http%3A%2F%2Flocal.sfgate.com%2F177387%2Fblog%2Frss%2F

Both claim that there is no feed at that address. When I do force_feed(true), I get this error:

PHP Notice: This XML document is invalid, likely due to invalid characters. XML error: SYSTEM or PUBLIC, the URI is missing at line 1, column 48 in /vol/code/dsh/mobile_sites/vendor/simplepie/simplepie/library/SimplePie.php on line 1386
@pawlows-dog

Greetings,
it seems we are having the same issue:
Since updating to WordPress 4.3.1 (German installation), there is one specific RSS feed that I can't get to run with the standard RSS widget that ships with WordPress.
Other RSS feeds are still working.
When I enter the URL, the feed widget itself gives me the following error: "A feed could not be found at http://www.proasyl.de/de/news/?type=100. A feed with an invalid mime type may fall victim to this error, or SimplePie was unable to auto-discover it.. Use force_feed() if you are certain this URL is a real feed."
I can see the RSS feed in my browser. I am also in contact with the people who provide this feed. In the beginning, different RSS feed validators reported a few errors in the feed. They have since overhauled it, and all validators now report "congrats" when you test it.
All, that is, except the standard RSS widget in WordPress (which I have now learned seems to be SimplePie), which just won't accept the feed.

I can incorporate feeds from other sites without any problems. Any ideas? Could someone test and verify that the feed at http://www.proasyl.de/de/news/?type=100 CAN be incorporated with WordPress 4.3.1 and the standard RSS widget?

I am completely stumped. I have no idea where else to look, and it would help me to know whether this is a bug in the RSS feed widget or whether something in my own installation is broken.

Cheers and thanks in advance to any kind soul that gives it a shot.

Edit to add: I am not completely sure if it really resulted from upgrading from 4.3.0 to 4.3.1 - but I noticed it shortly after the upgrade, so I think that might have caused it. The same feed above DID work once (and I believe under 4.3.0)

Miez

@mblaney
Member

mblaney commented Nov 10, 2015

Hello @pawlows-dog, a quick glance at http://www.proasyl.de/de/news/?type=100 shows there's a " in the title of one of the items. That's not valid XML, so SimplePie would reject it. I wrote a function to declare all HTML entities, but it's not part of mainline SimplePie. Also, that feed times out at http://simplepie.org/demo for some reason, so I couldn't test it there, but it's valid in my version.

@pawlows-dog

Hello Malcolm,
many thanks for your reply. I have the impression that SimplePie is a bit too picky. Every validator I test the feed on gives it kudos (e.g.: https://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.proasyl.de%2Fde%2Fnews%2F%3Ftype%3D100, or http://www.rssboard.org/rss-validator/check.cgi?url=http%3A%2F%2Fwww.proasyl.de%2Fde%2Fnews%2F%3Ftype%3D100, or http://feedvalidator.org/check.cgi?url=http%3A%2F%2Fwww.proasyl.de%2Fde%2Fnews%2F%3Ftype%3D100, ...).
I am not deep enough into XML to be a judge of validity, but if every known validator is happy with the feed format, shouldn't SimplePie just accept that feed as well?
I am not sure what to tell the feed's maintainers anymore. They overhauled their feed and all validators are happy. I can hardly ask them for more. It seems it is me who should find another parser or feed-injection tool to use on my WordPress site.
Does anyone have a recommendation? I don't need super-funky functionality. Just a lil' tool where I enter the URL of the feed and it displays the feed, preferably in a widget.
When I search for RSS plugins in WordPress, there seem to be no current, simple ones. Obviously because everyone is using the built-in SimplePie-based widget.
sigh

@dhilowitz
Author

The following feed also exhibits this same behavior: http://local.chron.com/168499/blog/rss/

@mblaney
Member

mblaney commented Mar 15, 2016

@dhilowitz @pawlows-dog this should be fixed now, please update your version of SimplePie and try again.

@mblaney mblaney closed this as completed Mar 15, 2016
@dhilowitz
Author

dhilowitz commented May 11, 2016

@mblaney This issue is still happening with http://local.chron.com/168499/blog/rss/. I just tested with version 1.4.

@dhilowitz
Author

dhilowitz commented May 11, 2016

Here's a hint. This doesn't work:

$feed = new SimplePie();
$feed->set_feed_url('http://local.chron.com/168499/blog/rss/');

But this does work:

$feed = new SimplePie();
$xml = file_get_contents('http://local.chron.com/168499/blog/rss/');
$feed->set_raw_data($xml);

In other words, it would seem that the problem is being caused by something in the way the feed is being downloaded from certain HTTP servers.
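
One way to check that hypothesis is to ask the server directly which HTTP status it returns for a given User-Agent. The following is a minimal sketch using only PHP's built-in get_headers() and stream_context_create(); the function names and the example user agent string are hypothetical and not part of SimplePie.

```php
<?php
// Sketch only: the function names below are hypothetical, not part of SimplePie.

/** Extract the numeric status code from a status line such as "HTTP/1.1 403 Forbidden". */
function status_code_from_line(string $line): ?int
{
    return preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m) === 1 ? (int) $m[1] : null;
}

/** Return the last status code in a list of raw header lines (the final response after any redirects). */
function final_status_code(array $headerLines): ?int
{
    $code = null;
    foreach ($headerLines as $line) {
        $candidate = is_string($line) ? status_code_from_line($line) : null;
        if ($candidate !== null) {
            $code = $candidate;
        }
    }
    return $code;
}

/** Request $url with the given User-Agent and report the final HTTP status, or null if the request failed. */
function probe_status(string $url, string $userAgent): ?int
{
    $context = stream_context_create([
        'http' => ['method' => 'GET', 'user_agent' => $userAgent, 'timeout' => 10],
    ]);
    $headers = @get_headers($url, false, $context);
    return $headers === false ? null : final_status_code($headers);
}

// Example (performs a network request, so it is left commented out):
// var_dump(probe_status('http://local.chron.com/168499/blog/rss/', 'ExampleReader/1.0'));
```

Calling probe_status() once with the user agent your code sends and once with a browser-like string should show whether the server treats the two differently.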

@dhilowitz
Author

Disregard! I figured out the issue. The server was returning 403 because of SimplePie's default user agent. Looks like I have to spoof user agents.
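
For anyone landing here with the same 403, here is a minimal sketch of overriding the user agent before fetching. set_feed_url() and set_useragent() are SimplePie methods; apply_user_agent() and the example user agent string are hypothetical, and the commented usage assumes a Composer install of simplepie/simplepie.

```php
<?php
// Sketch only: apply_user_agent() is a hypothetical helper, not part of SimplePie.
// It relies on SimplePie's own set_feed_url() and set_useragent() methods.

function apply_user_agent(object $feed, string $url, string $userAgent): object
{
    $feed->set_feed_url($url);        // where to fetch the feed from
    $feed->set_useragent($userAgent); // replaces SimplePie's default User-Agent header
    return $feed;
}

// Usage, assuming a Composer install of simplepie/simplepie:
// require __DIR__ . '/vendor/autoload.php';
// $feed = apply_user_agent(
//     new SimplePie(),
//     'http://local.chron.com/168499/blog/rss/',
//     'Mozilla/5.0 (compatible; ExampleReader/1.0)' // assumed value; pick one the server accepts
// );
// if (!$feed->init()) {
//     echo $feed->error();
// }
```

Whether a particular user agent string is accepted depends on the server, so the value above is only a placeholder.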
