Bozo Detection

Universal Feed Parser can parse feeds whether they are well-formed XML (Extensible Markup Language) or not. However, since some applications may wish to reject or warn users about non-well-formed feeds, Universal Feed Parser sets the bozo bit when it detects that a feed is not well-formed. Thanks to Tim Bray for suggesting this terminology.

Detecting a non-well-formed feed

>>> import feedparser
>>> d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
>>> d.bozo
0
>>> d = feedparser.parse('http://feedparser.org/tests/illformed/rss/aaa_illformed.xml')
>>> d.bozo
1
>>> d.bozo_exception
<xml.sax._exceptions.SAXParseException instance at 0x00BAAA08>
>>> exc = d.bozo_exception
>>> exc.getMessage()
"expected '>'\\n"
>>> exc.getLineNumber()
6

There are many reasons an XML (Extensible Markup Language) document could be non-well-formed besides the one shown in this example (incomplete end tags). See advanced.encoding for some other ways to trip the bozo bit.
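
An application that wants to reject or warn about non-well-formed feeds could wrap the bozo check in a small helper. The following is only a sketch: `describe_bozo`, `require_well_formed`, and `NotWellFormedFeed` are hypothetical names introduced here and are not part of Universal Feed Parser. It assumes only that the parsed result exposes `bozo` and, when the bit is set, `bozo_exception`, as in the example above. Because not every exception that trips the bozo bit is a SAX parse error, the line number is read only when the exception provides `getLineNumber()`.

```python
import warnings


class NotWellFormedFeed(ValueError):
    """Hypothetical application-level error; not defined by Universal Feed Parser."""


def describe_bozo(parsed):
    """Return None if the bozo bit is clear, else a short description of the problem."""
    if not getattr(parsed, "bozo", 0):
        return None
    exc = getattr(parsed, "bozo_exception", None)
    if exc is None:
        return "feed is not well-formed"
    # SAX parse errors expose getLineNumber(); other exception types may not,
    # so look the method up defensively instead of assuming it exists.
    get_line = getattr(exc, "getLineNumber", None)
    where = f" (line {get_line()})" if callable(get_line) else ""
    return f"feed is not well-formed{where}: {exc}"


def require_well_formed(parsed, strict=False):
    """Warn about a non-well-formed feed, or raise if strict is true.

    Returns the parsed result unchanged so the call can be chained.
    """
    problem = describe_bozo(parsed)
    if problem is None:
        return parsed
    if strict:
        raise NotWellFormedFeed(problem)
    warnings.warn(problem)
    return parsed
```

With this in place, a caller might write `d = require_well_formed(feedparser.parse(url), strict=True)` to refuse bozo feeds outright, or omit `strict` to keep the parsed data and only emit a warning.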