reference-entry-content.rst

entries[i].content

A list of dictionaries with details about the full content of the entry.

Atom feeds may contain multiple content elements. Clients should render as many of them as possible, based on the type and the client's abilities.
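
As a minimal sketch of rendering content by type, the helper below filters a list of content dictionaries down to the types a client can handle. It assumes plain dictionaries with the `type` and `value` keys described in this section; the function name `renderable_contents` and the sample data are hypothetical and not part of Universal Feed Parser.

```python
def renderable_contents(contents, supported_types):
    """Return the content dictionaries whose type the client can render.

    `contents` is assumed to look like entries[i].content: a list of
    dictionaries with at least "type" and "value" keys.
    """
    return [c for c in contents if c.get("type") in supported_types]


# Hypothetical sample data shaped like entries[i].content.
sample = [
    {"type": "text/plain", "value": "Hello, world"},
    {"type": "application/xhtml+xml", "value": "<div>Hello, <b>world</b></div>"},
    {"type": "image/svg+xml", "value": "<svg/>"},
]

# A client that only understands plain text and HTML-like markup.
supported = {"text/plain", "text/html", "application/xhtml+xml"}
for content in renderable_contents(sample, supported):
    print(content["type"], "->", content["value"])
```

With a real parsed feed, the same helper could be given `entries[i].content` in place of `sample`.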

entries[i].content[j].value

The value of this piece of content.

If this contains HTML (HyperText Markup Language) or XHTML (Extensible HyperText Markup Language), it is sanitized by default. (See advanced.sanitization for more details.)

Certain (X)HTML elements within this value may also contain relative URIs (Uniform Resource Identifiers). If so, they are resolved according to a set of rules. (See advanced.base for more details.)

entries[i].content[j].type

The content type of this piece of content.

Most likely values for `type`:

  • text/plain
  • text/html
  • application/xhtml+xml

For Atom feeds, the content type is taken from the type attribute, which defaults to text/plain if not specified. For RSS (Rich Site Summary) feeds, the content type is determined by inspecting the content, and defaults to text/html. Note that this may cause silent data loss if the value contains plain text with angle brackets. Universal Feed Parser cannot work around this problem; it is a limitation of RSS.

Future enhancement: some versions of RSS (Rich Site Summary) clearly specify that certain values default to text/plain, and Universal Feed Parser should respect this, but it doesn't yet.
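
Because the type tells a client how to interpret the value, one way to use it is sketched below: plain text is escaped before being embedded in an HTML page, while markup types are passed through. This is only an illustration; `content_as_html` is a hypothetical helper, and it assumes the value of any markup type has already been sanitized.

```python
import html


def content_as_html(content):
    """Return a string suitable for embedding in an HTML page.

    Assumes `content` is a dictionary with "type" and "value" keys.
    Plain text is escaped; markup types are returned unchanged on the
    assumption that they were already sanitized.
    """
    if content.get("type") == "text/plain":
        return html.escape(content["value"])
    return content["value"]


print(content_as_html({"type": "text/plain", "value": "if a < b and b > c"}))
print(content_as_html({"type": "text/html", "value": "<p>Hello</p>"}))
```

Escaping only applies when the type is text/plain, so an RSS value that was classified as text/html is not affected by this helper.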

entries[i].content[j].language

The language of this piece of content.

entries[i].content[j].language is supposed to be a language code, as specified by RFC 3066, but publishers have been known to publish arbitrary values like "English" or "German". Universal Feed Parser does not parse or normalize language codes.

entries[i].content[j].language may come from the element's xml:lang attribute, or it may be inherited from a parent element's xml:lang or from the Content-Language HTTP (Hypertext Transfer Protocol) header. If the feed does not specify a language, entries[i].content[j].language will be None, the Python null value.
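
Since the language value is passed through untouched and may be None, a client that needs consistent codes has to normalize it itself. The sketch below shows one possible approach; `normalize_language` and the `KNOWN_NAMES` table are hypothetical, and the table covers only the two example values mentioned above.

```python
# Hypothetical lookup table for non-standard values; extend as needed.
KNOWN_NAMES = {"english": "en", "german": "de"}


def normalize_language(value, default=None):
    """Return a lowercase language code, or `default` if none is available."""
    if value is None:
        return default
    cleaned = value.strip().lower()
    if not cleaned:
        return default
    # Map known language names to codes; leave anything else as-is.
    return KNOWN_NAMES.get(cleaned, cleaned)


print(normalize_language("English"))           # en
print(normalize_language("en-US"))             # en-us
print(normalize_language(None, default="en"))  # en
```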

entries[i].content[j].base

The original base URI (Uniform Resource Identifier) for links within this piece of content.

entries[i].content[j].base is only useful in rare situations and can usually be ignored. It is the original base URI (Uniform Resource Identifier) for this value, as specified by the element's xml:base attribute, or a parent element's xml:base, or the appropriate HTTP (Hypertext Transfer Protocol) header, or the URI of the feed. (See advanced.base for more details.) By the time you see it, Universal Feed Parser has already resolved relative links in all values where it makes sense to do so. Clients should never need to manually resolve relative links.
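
Purely to illustrate what resolving a relative link against a base URI means, the sketch below uses the standard library's `urljoin`. The base URI and paths are made-up examples, and this is not a description of how Universal Feed Parser performs resolution internally.

```python
from urllib.parse import urljoin

# Hypothetical base URI, standing in for entries[i].content[j].base.
base = "http://example.org/feed/"

# A path-relative link is resolved beneath the base URI.
relative_result = urljoin(base, "images/photo.png")
print(relative_result)  # http://example.org/feed/images/photo.png

# A root-relative link replaces the whole path of the base URI.
rooted_result = urljoin(base, "/about")
print(rooted_result)  # http://example.org/about
```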

Comes from

  • /atom03:feed/atom03:entry/atom03:content
  • /atom10:feed/atom10:entry/atom10:content
  • /rdf:RDF/rdf:item/content:encoded
  • /rss/channel/item/body
  • /rss/channel/item/content:encoded
  • /rss/channel/item/fullitem
  • /rss/channel/item/xhtml:body