(waiting for endorsements)
HTML with only content: definition and conversion tools. A simplified subset of the HTML5 tag set.
Digital content preservation repositories, many online tools with an "HTML upload" interface, offline software with "document loading", and the input modules of Content Management Systems and
of Document Management Systems (DMSs and old EDMs) all assume that the user is sending full-text content as a whole HTML document, in which only the content inside the body tag
is relevant. This is because HTML is the "lingua franca" and the best way to do content interchange.
What these cases call for is a simplified HTML for non-interactive content: the HTML-OnlyContent.
HTML5-onlyContent is a content tag suite for XML or HTML formats. It describes an HTML format that can be used as a "content container" in databases or in technical and legal literature published online. Its tag set (and attributes) is a subset of the HTML5 tag set, preserving the same HTML5 DTD, structure and semantic rules.
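As a rough illustration of what "a subset of the HTML5 tag set" means in practice, here is a minimal Python sketch of a whitelist filter. The names `ALLOWED_TAGS`, `ALLOWED_ATTRS`, `DROP_WITH_CONTENT` and `filter_content` are hypothetical, and the sets below are placeholders chosen for the example; the real subset is whatever the definition specifies.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelists, for illustration only (not the official subset).
ALLOWED_TAGS = {"p", "h1", "h2", "h3", "ul", "ol", "li", "em", "strong", "a"}
ALLOWED_ATTRS = {"href", "id", "title"}
# Elements whose text is not content and is dropped together with the tag.
DROP_WITH_CONTENT = {"script", "style"}


class ContentFilter(HTMLParser):
    """Keeps whitelisted tags and attributes, unwraps every other tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside script/style

    def handle_starttag(self, tag, attrs):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth += 1
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            kept = "".join(
                f' {name}="{escape(value or "")}"'
                for name, value in attrs
                if name in ALLOWED_ATTRS
            )
            self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in DROP_WITH_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif self.skip_depth == 0 and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Text is kept even when its enclosing tag was removed.
        if self.skip_depth == 0:
            self.out.append(escape(data, quote=False))


def filter_content(html_text):
    parser = ContentFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For example, `filter_content('<div><p onclick="a()">Hi <b>there</b></p></div>')` keeps the paragraph and its text but drops the `div`, the `b` and the `onclick` attribute. Void elements such as `br` are not handled specially in this sketch.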
Some transformation tools, compatible with the definition (see Filtering and normalizing section), are available:
- Simple XSLT tag filter:
htm_normalize.php, to ensure that the same content always yields the "same HTML source".
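The idea of getting the "same HTML source" from the same content can be sketched as a canonical re-serialization. The Python below is only an assumption about what such a normalization might involve (lower-cased names, sorted attributes, uniform quoting, collapsed whitespace); it does not describe what htm_normalize.php actually does, and `canonical_html` is a hypothetical name.

```python
import re
from html import escape
from html.parser import HTMLParser


class Canonicalizer(HTMLParser):
    """Re-serializes HTML so that equivalent inputs give identical text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names.
        ordered = sorted((name, value or "") for name, value in attrs)
        rendered = "".join(f' {n}="{escape(v)}"' for n, v in ordered)
        self.out.append(f"<{tag}{rendered}>")

    def handle_startendtag(self, tag, attrs):
        # "<br/>" and "<br>" are serialized the same way.
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Collapse every run of whitespace into a single space.
        self.out.append(escape(re.sub(r"\s+", " ", data), quote=False))


def canonical_html(html_text):
    parser = Canonicalizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out).strip()
```

With this sketch, `<P  Title="t" ID="x">Hello\n   world</P>` and `<p id='x' title='t'>Hello world</p>` both serialize to `<p id="x" title="t">Hello world</p>`. Collapsing whitespace is not safe inside `pre` elements, so a real normalizer would need to treat them separately.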
Another usual (and complex) task in this context is to transform all the CSS, and all
class attributes, into
style attributes (see, e.g., the CssToInlineStyles project). In the same pass, uses of the old
center tag can be replaced, and bolds and italics can be normalized.
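A much simplified Python sketch of this transformation follows. It assumes the stylesheet has already been reduced to a plain class-to-declarations map (`CLASS_STYLES`, hypothetical), whereas a real tool such as CssToInlineStyles has to resolve full CSS selectors. The `TAG_MAP` replacements for `center`, `b` and `i` are likewise an illustrative choice, not a rule taken from the definition.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical stylesheet, already resolved to class -> declarations.
CLASS_STYLES = {"note": "font-style:italic", "warn": "color:red"}
# Hypothetical replacements for presentational tags.
TAG_MAP = {"b": "strong", "i": "em", "center": "div"}


class Inliner(HTMLParser):
    """Moves class-based CSS into style attributes and renames old tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        styles = []
        if tag == "center":
            styles.append("text-align:center")
        # Replace each known class by its declarations; drop the attribute.
        for cls in (attrs.pop("class", None) or "").split():
            if cls in CLASS_STYLES:
                styles.append(CLASS_STYLES[cls])
        # An existing inline style goes last, so it still wins on conflicts.
        if attrs.get("style"):
            styles.append(attrs["style"].strip().rstrip(";"))
        if styles:
            attrs["style"] = ";".join(styles)
        rendered = "".join(f' {k}="{escape(v or "")}"' for k, v in attrs.items())
        self.out.append(f"<{TAG_MAP.get(tag, tag)}{rendered}>")

    def handle_endtag(self, tag):
        self.out.append(f"</{TAG_MAP.get(tag, tag)}>")

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def inline_styles(html_text):
    parser = Inliner()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

For instance, `inline_styles('<center><b class="warn">x</b></center>')` gives `<div style="text-align:center"><strong style="color:red">x</strong></div>`.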