Removal of specialized HTML literal handling?

Possible easy solution for #2935 and #2945

We forked `html5lib` to make `html5lib-modern` because no newer library provides the same XML-based HTML-tokenizing functionality that `html5lib` does. There is no alternative to move to.

Beautiful Soup 4 (`beautifulsoup4`) is the logical replacement, but it includes `html5lib` in its dependency tree, which defeats the whole point.

But what if we just dropped that feature entirely? Why does RDFLib even want to be able to tokenize HTML Literals? The feature was added for a reason, but do we need to keep it?

Can we simply drop that feature, treat HTML literals the same as any other string literal, and remove `html5lib` from our dependencies entirely?
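To make the proposal concrete, here is a minimal sketch of what "treat HTML the same as any other string literal" could mean. `OpaqueLiteral` is a hypothetical stand-in written for illustration only, not RDFLib's `Literal` class, and it assumes the literal's value is simply its lexical form with no datatype-specific parsing.

```python
from dataclasses import dataclass

RDF_HTML = "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class OpaqueLiteral:
    """Hypothetical literal that never parses its lexical form (not RDFLib's API)."""

    lexical: str
    datatype: str = XSD_STRING

    @property
    def value(self) -> str:
        # No HTML tokenizing: the value is the lexical form, unchanged.
        return self.lexical


a = OpaqueLiteral("<p>Hello</p>", RDF_HTML)
b = OpaqueLiteral("<p>Hello</p>", RDF_HTML)
c = OpaqueLiteral("<P>Hello</P>", RDF_HTML)

print(a == b)  # identical lexical form and datatype
print(a == c)  # markup spelled differently, so compared as different strings
```

The trade-off in this sketch is that two literals whose markup differs only in spelling are distinct terms, because nothing parses or normalizes them. In exchange, no HTML parser is needed at all.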


Removal of specialized HTML literal handling? #2946
