MergeableDocumentElement.__merge_entities__() methods are no longer ignored. Responsibility for merging two documents has moved from
Session.merge() method to
crawl() now returns a set of
CrawlResult objects instead of
feeds parameter of
crawl() function was renamed to
feed_uri parameter and corresponding
feed_uri attribute to
A timeout option was added to the crawler.
timeout parameter to
timeout parameter to
DEFAULT_TIMEOUT constant which is 10 seconds.
LinkList.favicon property. [
Link.relation attribute, which had been optional, is now required.
AutoDiscovery.find_feed_url() method (that returned feed links) was removed. Instead
AutoDiscovery.find() method (that returns a pair of feed links and favicon links) was introduced. [
Subscription.icon_uri attribute was introduced. [
#49] Added an optional
icon_uri parameter to
SubscriptionSet.subscribe() method. [
normalize_xml_encoding() function to work around
encoding detection bug. [ #41] Added
guess_tzinfo_by_locale() function. [
microseconds option to
Fixed incorrect merge of subscription/category deletion.
Subscriptions are now archived rather than deleted.
Outline (which is a common superclass of
Category) now has
deleted_at attribute and
rss2 parser bugs.
Now the parser accepts several malformed
It now guesses the time zone according to its
<language> and the ccTLD (if applicable) when the date time doesn't give any explicit time zone (which is also malformed). [
#41] It had ignored
<category> elements other than the last one; now it accepts as many as there are.
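A minimal sketch of the idea behind this fix: collect every `<category>` child instead of keeping only the last one. It uses only the standard library's `xml.etree.ElementTree`; the sample item and the `parse_categories` helper are hypothetical illustrations, not the library's own parser code.

```python
import xml.etree.ElementTree as ET

# Hypothetical RSS 2.0 item carrying more than one <category>.
SAMPLE_ITEM = """\
<item>
  <title>Example entry</title>
  <category>python</category>
  <category>feeds</category>
  <category>rss</category>
</item>
"""


def parse_categories(item_xml):
    """Return the text of every <category> child, in document order."""
    item = ET.fromstring(item_xml)
    # findall() yields all matching children; keeping only the last
    # match would silently drop the earlier categories.
    return [c.text.strip() for c in item.findall('category') if c.text]


categories = parse_categories(SAMPLE_ITEM)
print(categories)
```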
It had ignored
<comments> links entirely; now these are parsed into
Link objects with
Some RSS 2 feeds put a URI into
<generator>, so the parser now treats it as
uri rather than
value in such cases.
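As a rough illustration of this kind of check, the sketch below decides whether the text of a `<generator>` element looks like a URI. The `parse_generator` helper, its dictionary return shape, and the "http/https with a host" heuristic are all assumptions made for the example; the library's actual rule may differ.

```python
from urllib.parse import urlparse


def parse_generator(text):
    """Classify <generator> text as a URI or a plain value (hypothetical helper)."""
    text = text.strip()
    parts = urlparse(text)
    # Assumed heuristic: an http(s) scheme plus a host means "this is a URI".
    if parts.scheme in ('http', 'https') and parts.netloc:
        return {'uri': text, 'value': None}
    return {'uri': None, 'value': text}


as_uri = parse_generator('http://wordpress.org/?v=3.5')
as_value = parse_generator('WordPress 3.5')
```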
<enclosure> links had been parsed as
relation attribute, but the parser now properly sets the attribute to
<link> elements in the Atom namespace are now parsed correctly as well.
atom parser bugs.
It now accepts the obsolete PURL Atom namespace.
Some broken Atom feeds (e.g. Naver Blog) provide date times in RFC 822 format, which is incorrect according to RFC 4287 (section 3.3), so the parser now accepts RFC 822 format as well.
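The fallback described here could look roughly like the following sketch, which tries the RFC 3339 form first and only then the RFC 822 form. It relies on the standard library alone; `parse_atom_datetime` is a hypothetical name, and this is not the library's own implementation.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_atom_datetime(text):
    """Parse an RFC 3339 timestamp, falling back to RFC 822 (hypothetical helper)."""
    text = text.strip()
    try:
        # Older Python versions reject a trailing "Z" in fromisoformat(),
        # so normalize it to an explicit UTC offset first.
        return datetime.fromisoformat(text.replace('Z', '+00:00'))
    except ValueError:
        # Not RFC 3339; try the RFC 822 form that some broken feeds emit.
        return parsedate_to_datetime(text)


standard = parse_atom_datetime('2013-09-01T12:00:00Z')
malformed = parse_atom_datetime('Sun, 01 Sep 2013 21:00:00 +0900')
```

Both calls return timezone-aware `datetime` objects, so the two results can be compared directly even though the inputs use different offsets.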
Some broken Atom feeds (e.g. Naver Blog) use
<modified> which is not standard instead of
<updated> which is standard, so the parser now treats
<modified> as equivalent to
<summary> can now have
text/html in addition to
<contributor> is now ignored if it doesn't have any of
Fixed a parser bug that failed to interpret the omission of
link[rel] attribute as
Fixed the parser to work well even if there are any file separator characters (FS,