Universal Feed Parser is a Python module for downloading and parsing syndicated feeds. It can handle RSS (Rich Site Summary) 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, Atom 1.0, and CDF (Channel Definition Format) feeds. It also parses several popular extension modules, including Dublin Core and Apple's iTunes extensions.
To use Universal Feed Parser, you will need Python 3.6 or later. Universal Feed Parser is not meant to run standalone; it is a module for you to use as part of a larger Python program.
Universal Feed Parser is easy to use; the module is self-contained in a single file, feedparser.py, and it has one primary public function, parse. parse takes a number of arguments, but only one is required, and it can be a URL (Uniform Resource Locator), a local filename, or a raw string containing feed data in any format.
The following example parses a feed from a remote URL:
>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
>>> d['feed']['title']
'Sample Feed'
The following example assumes you are on Windows, and that you have saved a feed at c:\incoming\atom10.xml.
Note: Universal Feed Parser works on any platform that can run Python; use the path syntax appropriate for your platform.
>>> import feedparser
>>> d = feedparser.parse(r'c:\incoming\atom10.xml')
>>> d['feed']['title']
'Sample Feed'
Universal Feed Parser can also parse a feed in memory:
>>> import feedparser
>>> rawdata = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""
>>> d = feedparser.parse(rawdata)
>>> d['feed']['title']
'Sample Feed'
Values are returned as Python Unicode strings (except when they're not -- see advanced.encoding for all the gory details).