Introduction

Universal Feed Parser is a Python module for downloading and parsing syndicated feeds. It can handle RSS (Rich Site Summary) 0.90, Netscape RSS 0.91, Userland RSS 0.91, RSS 0.92, RSS 0.93, RSS 0.94, RSS 1.0, RSS 2.0, Atom 0.3, Atom 1.0, CDF (Channel Definition Format), and JSON (JavaScript Object Notation) feeds. It also parses several popular extension modules, including Dublin Core and Apple's iTunes extensions.

To use Universal Feed Parser, you will need Python 3.8 or later. Universal Feed Parser is not meant to run standalone; it is a module for you to use as part of a larger Python program.

Universal Feed Parser is easy to use; it has one primary public function, parse. The parse function takes a number of arguments, but only one is required: a URL (Uniform Resource Locator), a local filename, or a raw string containing feed data in any format.

Parsing a feed from a remote URL

>>> import feedparser
>>> d = feedparser.parse('$READTHEDOCS_CANONICAL_URL/examples/atom10.xml')
>>> d['feed']['title']
'Sample Feed'

Parsing a feed from a local file

The following example assumes you are on Windows and that you have saved a feed at c:\incoming\atom10.xml.

Note

Universal Feed Parser works on any platform that can run Python; use the path syntax appropriate for your platform.

>>> import feedparser
>>> d = feedparser.parse(r'c:\incoming\atom10.xml')
>>> d['feed']['title']
'Sample Feed'
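
If the same program has to run on more than one platform, one option is to build the filename with the standard library's pathlib instead of hard-coding a Windows path. The following is a minimal sketch, not part of Universal Feed Parser itself: the helper names local_feed_path and local_feed_title and the directory layout are assumptions made for illustration.

```python
from pathlib import Path


def local_feed_path(directory, filename="atom10.xml"):
    """Join a directory and a feed filename using the current platform's rules.

    Hypothetical helper; the default filename mirrors the example above.
    """
    return str(Path(directory) / filename)


def local_feed_title(directory, filename="atom10.xml"):
    """Parse a feed saved on disk and return its feed-level title."""
    # feedparser is imported here rather than at module level so that
    # local_feed_path can be used even where feedparser is not installed.
    import feedparser

    d = feedparser.parse(local_feed_path(directory, filename))
    return d['feed']['title']
```

Assuming a feed has been saved as atom10.xml in a directory named incoming, local_feed_title('incoming') would hand parse a filename with the separator appropriate for the platform it runs on.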

Universal Feed Parser can also parse a feed in memory.

Parsing a feed from a string

>>> import feedparser
>>> rawdata = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""
>>> d = feedparser.parse(rawdata)
>>> d['feed']['title']
'Sample Feed'

Values are returned as Python Unicode strings (except when they're not -- see advanced.encoding for all the gory details).
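
Because Universal Feed Parser is meant to be used from a larger program, it can be convenient to wrap parse in a small function of your own. The sketch below is illustrative only: feed_title and RAW_FEED are hypothetical names, and the function assumes the feed actually has a title.

```python
# The same in-memory RSS document used in the string example above.
RAW_FEED = """<rss version="2.0">
<channel>
<title>Sample Feed</title>
</channel>
</rss>"""


def feed_title(source):
    """Return the feed title for a URL, a local filename, or a raw string.

    Hypothetical wrapper around feedparser.parse; it raises KeyError if
    the parsed feed has no title.
    """
    # Imported inside the function so that the surrounding program can
    # still be imported where feedparser is not installed.
    import feedparser

    d = feedparser.parse(source)
    return d['feed']['title']
```

With feedparser installed, feed_title(RAW_FEED) should give the same 'Sample Feed' value as the string example, and isinstance(feed_title(RAW_FEED), str) is a quick way to confirm that the value is a Unicode string.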