Universal Feed Parser can parse feeds whether they are well-formed XML (Extensible Markup Language) or not. However, since some applications may wish to reject or warn users about non-well-formed feeds, Universal Feed Parser sets the bozo bit when it detects that a feed is not well-formed. Thanks to Tim Bray for suggesting this terminology.
>>> d = feedparser.parse('http://feedparser.org/docs/examples/atom10.xml')
>>> d.bozo
0
>>> d = feedparser.parse('http://feedparser.org/tests/illformed/rss/aaa_illformed.xml')
>>> d.bozo
1
>>> d.bozo_exception
<xml.sax._exceptions.SAXParseException instance at 0x00BAAA08>
>>> exc = d.bozo_exception
>>> exc.getMessage()
"expected '>'\n"
>>> exc.getLineNumber()
6
There are many reasons an XML document could be non-well-formed besides this example (incomplete end tags). See advanced.encoding for some other ways to trip the bozo bit.
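An application that wants to warn about or reject such feeds could wrap the bozo check in a small helper. The following is a minimal sketch, not part of Universal Feed Parser: the function name check_well_formed and its strict flag are hypothetical, and it assumes only that the parsed result exposes bozo and, when the bit is set, bozo_exception, as shown in the session above.

```python
import warnings


def check_well_formed(parsed, strict=False):
    """Return True if the bozo bit is clear, False (with a warning) if set.

    `parsed` is assumed to be the object returned by feedparser.parse().
    The helper name and the `strict` flag are hypothetical, for
    illustration only.
    """
    if not parsed.bozo:
        return True
    # bozo_exception is only expected to be present when the bozo bit is set.
    exc = getattr(parsed, "bozo_exception", None)
    message = "feed tripped the bozo bit: %r" % (exc,)
    if strict:
        # Reject the feed outright.
        raise ValueError(message)
    # Otherwise warn and let the caller keep using the parsed data.
    warnings.warn(message)
    return False
```

Called on the second result above (where d.bozo is 1), check_well_formed(d) would emit a warning and return False, while check_well_formed(d, strict=True) would raise ValueError. Either way the parsed data stays available on d, so the caller decides whether to use it.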