Out of memory trying to parse broken feed #142

Closed
MidnightLightning opened this issue Jun 10, 2011 · 7 comments

@MidnightLightning

The Washington Post RSS feed (http://feeds.washingtonpost.com/rss/world) is currently broken and is not a valid Atom feed. SimplePie fails to detect the bad feed, runs into an infinite loop (in the "tag_open" function), and runs out of memory, causing a fatal crash on my site. I'm not sure what the issue is with the feed, but hopefully SimplePie can be updated to detect such corruption and not crash along with it in the future.

@MidnightLightning
Author

Taking a closer look, the cause may be the size of the feed, which is 1.7 MB in full (yes, a 37,000-line text file). Or it may be that some of the Atom "doc" elements don't actually have a "story" sub-element (only an "rss" and a "dbMetadata" sub-element). An extract of the two types of "doc" elements this feed has is at https://gist.github.com/1019236.

@rmccue
Contributor

rmccue commented Jun 20, 2011

The doc tag isn't actually part of any specification, so I have no idea what its purpose is. I can't see it in the linked feed either, so I have no idea where that has appeared from.

That said, it sounds like you are just running out of memory, due to the size of the feed, if it is actually 1.7MB.

@MidnightLightning
Author

PHP was set to allow up to 32 MB of memory. The exact error I was getting was "Allowed memory size of 33554432 bytes exhausted (tried to allocate 40 bytes) in /path/to/simplepie.inc on line 14588", where the 40 bytes varied, as did the line number by a little. I find it a bit of a stretch to think that a 1.7 MB file would need nearly 20 times that amount of memory to be parsed...
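
For reference, the arithmetic behind that comparison can be sketched in a few lines of PHP. The two figures come from the error message and the feed size reported above. Treating "1.7 MB" as 1.7 × 1024 × 1024 bytes is an assumption, and `peak_memory_during()` is a hypothetical helper for illustration, not part of SimplePie.

```php
<?php
// Figures taken from the error message and the feed size reported in this thread.
$limitBytes = 33554432;                   // "Allowed memory size of 33554432 bytes"
$feedBytes  = (int) (1.7 * 1024 * 1024);  // assumption: "1.7 MB" means 1.7 MiB

// Express the limit in MiB and compare it with the raw feed size.
$limitMiB = $limitBytes / (1024 * 1024);
$ratio    = $limitBytes / $feedBytes;

printf("limit: %d MiB, limit/feed ratio: %.1f\n", $limitMiB, $ratio);

// Hypothetical helper: run any callback and report PHP's peak memory afterwards.
function peak_memory_during(callable $work): int
{
    $work();
    return memory_get_peak_usage(true);
}
```

Wrapping the actual parse call in a helper like this would show how far the parser's peak usage is from the raw feed size, which is the number that matters when choosing a `memory_limit`.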

@X4

X4 commented Oct 18, 2011

Just for the record, here's another report of this bug in the wild, with SimplePie version 1.2 in use:
http://www.concrete5.org/index.php?cID=215967&editmode=1

@rgriffith

Not sure if this is totally related, but I ran into an infinite loop using the following snippet (SimplePie v1.2.1-dev):

$feed = new SimplePie();
$feed->set_feed_url('http://news.yahoo.com/rss/us');
$feed->set_item_limit(20);
$feed->handle_content_type();
$feed->init();
die(var_dump($feed->get_items()));

@rmccue
Contributor

rmccue commented Dec 3, 2011

@rgriffith That's just a side-effect of recursive references (the item points to the feed, which points to the item, which points to the feed, etc).
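
A minimal sketch of that kind of cycle, using hypothetical stand-in classes (`DemoFeed` and `DemoItem` are not SimplePie's real classes), and one way to inspect the data without walking the whole object graph:

```php
<?php
// Hypothetical stand-ins for a feed and its items; not SimplePie's real classes.
class DemoFeed
{
    public $items = array();
}

class DemoItem
{
    public $title;
    public $feed; // back-reference to the owning feed

    public function __construct($title, DemoFeed $feed)
    {
        $this->title = $title;
        $this->feed  = $feed;
    }
}

$feed = new DemoFeed();
$feed->items[] = new DemoItem('First story', $feed);
$feed->items[] = new DemoItem('Second story', $feed);

// The cycle: following feed -> item -> feed lands on the very same object.
$isCycle = ($feed->items[0]->feed === $feed);

// Dump only the scalar fields of interest instead of the full object graph.
$titles = array();
foreach ($feed->items as $item) {
    $titles[] = $item->title;
}
var_dump($titles);
```

With SimplePie itself, the equivalent would be to loop over `$feed->get_items()` and dump something like `$item->get_title()` for each item, rather than passing the item objects straight to `var_dump()`.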

@rmccue
Contributor

rmccue commented Jan 16, 2012

@X4 That's an instance of the autodiscovery bug documented in #37, and is now fixed.

As for this issue itself, it was an attempt to parse an invalid feed (one that I'm not sure was anything like a feed at all), and the issue no longer occurs, so I'm going to close this as invalid. If the issue does crop up again, please reopen this issue or open another one.

@rmccue rmccue closed this as completed Jan 16, 2012