Thanks for this project! I'm using it for http://magnet.io and it's saving me a lot of effort.
Maybe the project should include moment.js or another date parsing library for the pubdate parsing. There are a lot of feeds where the various date entries (dc:date, pubdate, etc.) just aren't detected correctly and thus come back as null. I ran into this example last night:
The date is formatted like:
As a workaround for now, when I see a null, I dive into the meta object and check whether moment.js can parse the date. It seems to be working well enough, but it should probably be in your library directly.
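A rough sketch of that kind of fallback, in case it helps anyone else. Everything here is an assumption rather than feedparser's API: `recoverPubdate`, `parseLenient`, and the list of raw field names are made up for illustration, and the raw values are assumed to be either plain strings or objects with the text under a `'#'` key. `parseLenient` uses the built-in `Date` constructor only so the snippet runs on its own; a moment.js-based parser could be passed in instead.

```javascript
// Hypothetical helper, not part of feedparser.
// Stand-in for moment.js or any other lenient date parser:
// returns a Date on success, null on failure.
function parseLenient(text) {
  const date = new Date(text);
  return Number.isNaN(date.getTime()) ? null : date;
}

// Assumed names of the properties that hold the raw date strings.
const RAW_DATE_FIELDS = ['rss:pubdate', 'dc:date', 'atom:published', 'atom:updated'];

function recoverPubdate(item, parse = parseLenient) {
  // Nothing to do if the library already produced a Date.
  if (item.pubdate instanceof Date) return item.pubdate;

  for (const field of RAW_DATE_FIELDS) {
    const raw = item[field];
    // Assumption: the raw value is a string, or an object with its text under '#'.
    const text = typeof raw === 'string' ? raw : raw && raw['#'];
    if (typeof text !== 'string') continue;

    const date = parse(text.trim());
    if (date) return date; // first field that parses wins
  }
  return null; // nothing usable; the caller still has the raw strings
}
```

Calling `recoverPubdate(item)` only when `item.pubdate` is null keeps the library's own result as the first choice and treats the lenient parser as a last resort.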
Thanks, Russ. I'm on the fence about this. I have this personal reaction, "If the publisher can't be bothered to publish valid data, then wtf do I care?!" But, feedparser is a library, not my application, so I can see where being more liberal in what we receive is the way to go here. If I can simply plug in moment.js and have "slightly invalid" dates work (such as in your example feed), then I'm cool with that. SimplePie's script is ridiculous, though.
At the least, in these edge cases, it seems best to not throw away original data in case somebody wants it.
Just to get some real numbers, I went through the 14,178 feeds I have right now and only 149 of them returned null in the pubdate. Rechecking those feeds gave me only a handful that were badly formatted; in the rest, the date is simply nonexistent.
So sorry - it doesn't seem to be as big of a deal as I initially thought. I ran into the spinner.ca thing early and assumed it would be more common, but it doesn't look that way. I don't think it's an issue any more.
@russellbeattie Russ, thanks for following up. I still wouldn't object if moment could make the date parsing more reliable.
@rdbcci The original data is always in the un-normalized property. For example, with an Atom 1.0 feed, each post's "pubdate" property would have the normalized Date object (unless it cannot be parsed), and the "atom:published" property would retain whatever string was in the feed.
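To make that concrete, here is a small sketch of reading the original text when normalization fails. The item below is a made-up example, not real feed output; the `'#'` layout of the raw property and the `originalDateText` helper are assumptions for illustration only.

```javascript
// Made-up item: normalization failed, but the original text is still there.
const post = {
  pubdate: null,
  'atom:published': { '#': '05/10/2012 2:30pm' }, // illustrative value only
};

// Hypothetical helper, not part of feedparser.
// Assumption: the raw property is a string or an object with its text under '#'.
function originalDateText(item, rawField) {
  const raw = item[rawField];
  if (typeof raw === 'string') return raw;
  return raw && typeof raw['#'] === 'string' ? raw['#'] : undefined;
}

// Prefer the normalized Date; fall back to the untouched original string.
const shown = post.pubdate
  ? post.pubdate.toISOString()
  : originalDateText(post, 'atom:published');
```

So nothing is thrown away: an application that wants to do its own parsing can start from the un-normalized value.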