A list of dictionaries with details about the full content of the entry.
Atom feeds may contain multiple content elements. Clients should render as many of them as possible, based on the type and the client's abilities.
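As a rough illustration of choosing among multiple content elements, the sketch below picks one item by content type. It assumes each item behaves like a dictionary with `type` and `value` keys, as described on this page. The function name `pick_content`, the preference order, and the sample data are hypothetical and not part of Universal Feed Parser.

```python
def pick_content(contents, preferred=("text/html", "application/xhtml+xml", "text/plain")):
    """Return the item whose type ranks highest in `preferred`.

    Falls back to the first item if no type matches, and to None if
    the list is empty.
    """
    for wanted in preferred:
        for item in contents:
            if item.get("type") == wanted:
                return item
    return contents[0] if contents else None


# Hypothetical data shaped like entries[i].content
sample = [
    {"type": "text/plain", "value": "Hello", "language": None, "base": ""},
    {"type": "text/html", "value": "<p>Hello</p>", "language": "en", "base": ""},
]

best = pick_content(sample)
print(best["type"], best["value"])
```

A client that can render several types at once could instead loop over the whole list; the single-item choice here is only one possible policy.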
The value of this piece of content.
If this contains HTML (HyperText Markup Language) or XHTML (Extensible HyperText Markup Language), it is sanitized by default (see advanced.sanitization).
If this contains HTML or XHTML, certain (X)HTML elements within this value may contain relative URIs (Uniform Resource Identifiers). If so, they are resolved according to a set of rules (see advanced.base).
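To make "resolving a relative URI" concrete, here is a minimal sketch using the standard library's `urllib.parse.urljoin`. It only illustrates the general idea of joining a relative reference to a base; it is not Universal Feed Parser's own resolution code, and the example URLs are made up.

```python
from urllib.parse import urljoin

# Hypothetical base URI, e.g. taken from xml:base or the feed's own address
base = "http://example.org/feed/atom.xml"

# A relative reference is interpreted against the base
resolved_relative = urljoin(base, "../img/a.png")

# An absolute reference is left unchanged
resolved_absolute = urljoin(base, "http://other.example/x.png")

print(resolved_relative)
print(resolved_absolute)
```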
The content type of this piece of content.
Most likely values for `type`:
- text/plain
- text/html
- application/xhtml+xml
For Atom feeds, the content type is taken from the type attribute, which defaults to text/plain if not specified. For RSS (Rich Site Summary) feeds, the content type is auto-determined by inspecting the content, and defaults to text/html. Note that this may cause silent data loss if the value contains plain text with angle brackets. There is nothing I can do about this problem; it is a limitation of RSS.
Future enhancement: some versions of RSS clearly specify that certain values default to text/plain, and Universal Feed Parser should respect this, but it doesn't yet.
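One way a client might act on the content type when embedding a value in an HTML page is sketched below. It assumes the item is dictionary-like with `type` and `value` keys. The function name `to_html_fragment` is hypothetical, and treating every unrecognized type as plain text is an assumption of this sketch, not documented behavior.

```python
import html


def to_html_fragment(item):
    """Return markup suitable for embedding in an HTML page."""
    ctype = item.get("type", "text/plain")
    value = item.get("value", "")
    if ctype in ("text/html", "application/xhtml+xml"):
        # Markup is passed through unchanged; this sketch does no
        # sanitizing of its own.
        return value
    # Assumption: anything else is treated as plain text and escaped,
    # so angle brackets are displayed rather than interpreted.
    return "<pre>" + html.escape(value) + "</pre>"


print(to_html_fragment({"type": "text/plain", "value": "a < b"}))
print(to_html_fragment({"type": "text/html", "value": "<p>Hi</p>"}))
```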
The language of this piece of content.
entries[i].content[j].language is supposed to be a language code, as specified by RFC 3066, but publishers have been known to publish random values like "English" or "German". Universal Feed Parser does not do any parsing or normalization of language codes.
entries[i].content[j].language may come from the element's xml:lang attribute, or it may inherit from a parent element's xml:lang, or the Content-Language HTTP (Hypertext Transfer Protocol) header. If the feed does not specify a language, entries[i].content[j].language will be None, the Python null value.
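Because the language value is passed through as published, a client that needs a usable code may want to clean it up itself. The sketch below is one heuristic way to do that, not a full RFC 3066 validator. The function name `normalize_language`, the name-to-code table, and the rule of accepting only 2- or 3-letter primary subtags are all assumptions made for this example.

```python
import re

# Heuristic: a 2- or 3-letter primary subtag, optionally followed by
# further subtags of 1 to 8 letters or digits (for example "en-us").
_LANG_CODE = re.compile(r"^[a-z]{2,3}(-[a-z0-9]{1,8})*$")

# Hypothetical lookup for free-text names seen in the wild; extend as needed.
_NAME_TO_CODE = {"english": "en", "german": "de"}


def normalize_language(value):
    """Return a lowercase language code, or None if nothing usable is found."""
    if value is None:
        return None
    cleaned = value.strip().lower()
    if cleaned in _NAME_TO_CODE:
        return _NAME_TO_CODE[cleaned]
    if _LANG_CODE.match(cleaned):
        return cleaned
    return None


for raw in ("en-US", "English", None, "random words"):
    print(repr(raw), "->", repr(normalize_language(raw)))
```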
The original base URI for links within this piece of content.
entries[i].content[j].base is only useful in rare situations and can usually be ignored. It is the original base URI for this value, as specified by the element's xml:base attribute, or a parent element's xml:base, or the appropriate HTTP header, or the URI of the feed. (See advanced.base for more details.) By the time you see it, Universal Feed Parser has already resolved relative links in all values where it makes sense to do so. Clients should never need to manually resolve relative links.
Comes from
- /atom03:feed/atom03:entry/atom03:content
- /atom10:feed/atom10:entry/atom10:content
- /rdf:RDF/rdf:item/content:encoded
- /rss/channel/item/body
- /rss/channel/item/content:encoded
- /rss/channel/item/fullitem
- /rss/channel/item/xhtml:body