Namespaces #6
Comments
I have no strong opinion on this, but I lean towards one of the first two options. The third option seems unnecessarily cluttered and error-prone (easier to be messed up by plugins). Since namespaces are naturally nested, it seems logical to have a rule that a namespace is determined by closest
👍 I’m leaning towards the first. It’ll be easy to add (bookkeeping is already done when parsing, and one walk down could opt to add only necessary namespaces), and it’ll be easy to handle for plug-in authors.
Is there anything else that namespaces would be used for between parsing and compilation, except to determine the proper tag/attribute casing of an element?
I can think of lots of parse differences, but those are handled internally (I’m about to switch to a much better parser, parse5) already. Then, there’s compilation differences, but those can of course be handled there pretty OK (as it’s in rehype-stringify). Major use case for user-land would be to not walk into SVG / MathML by accident, I think. Hmm. That can be checked easily by determining whether an element is
Have you thought about making it explicit then? Maybe some other property like
How would it be different from namespaces?
Not really different, just one less property.
That’s also possible, and it would make HAST more like programming language syntax trees. There’s also an edge case where HTML is in SVG/MathML, which itself is in HTML. If either namespaces or subroots were used, it would be possible to walk into trees, remembering the current namespace, and transforming just the HTML namespaced elements.
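To make that walk concrete, here is a rough sketch of descending a tree while remembering the current namespace and visiting only the HTML elements. It is an illustration under assumptions, not how rehype does it: `visitHtml` is a hypothetical helper, the space is inferred from tag names rather than read from a `namespace` field, and only SVG’s `foreignObject` is treated as a way back into HTML (real parsers know more integration points).

```javascript
// Hypothetical helper: call `visitor` on HTML elements only, tracking the
// current space ('html', 'svg', or 'mathml') while descending.
function visitHtml(node, visitor, space = 'html') {
  let own = space

  if (node.type === 'element') {
    // `<svg>` and `<math>` switch to a foreign space when met in HTML.
    if (space === 'html' && node.tagName === 'svg') own = 'svg'
    else if (space === 'html' && node.tagName === 'math') own = 'mathml'

    if (own === 'html') visitor(node)
  }

  // Simplification: children of an SVG `foreignObject` are HTML again.
  const inner = own === 'svg' && node.tagName === 'foreignObject' ? 'html' : own

  for (const child of node.children || []) visitHtml(child, visitor, inner)
}

// HTML in SVG in HTML: only the outer `a` and the inner `p` are HTML.
const el = (tagName, children = []) => ({type: 'element', tagName, children})
const tree = {
  type: 'root',
  children: [el('a'), el('svg', [el('a'), el('foreignObject', [el('p')])])]
}

const seen = []
visitHtml(tree, (node) => seen.push(node.tagName))
console.log(seen) // [ 'a', 'p' ]
```

The `a` inside `svg` is skipped even though its tag name also exists in HTML, which is the case tag names alone cannot settle.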
Not doing it for now. Maybe a utility which, when given a node, checks if it’s a foreign element, would do the trick.
Linking to rehypejs/rehype#2 (comment).
I thought about it a bit and I’d like to work on this now. For starters, this issue is now first tracked in wooorm/property-information#6. When that is done, we can work on updating it throughout the ecosystem. I think we may be able to do without namespaces. But maybe we need to have, just like
I’ll close this now. Again, if anyone has any further comments, please post them there!
TL;DR
I’m thinking out loud. We need namespace information. I can think of three solutions. Not sure which is best.
Introduction
HTML has the concept of elements: things like `<strong></strong>` are normal elements. There’s a subcategory of “foreign elements”: those from MathML (`mi`) or from SVG (`rect`).
A practical example of why this information is needed is tag-name normalisation: in HTML, tag names are case-insensitive. In SVG or MathML, they are not. Unfortunately, tag names themselves cannot be used to detect whether an element is foreign, because some elements exist in multiple namespaces. For example: `var` in HTML and MathML, and `a` in HTML and SVG.
Take the following code:
When running the following script:
Yields:
Note 1: Non-foreign elements break out of their foreign context.
Note 2: HTML is case-insensitive (normalised to upper-case), foreign elements are case-sensitive.
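As a small illustration of Note 2 only (this is not the snippet referred to above), the casing rule could be sketched like this. `reportedTagName` is a hypothetical helper that mimics how the DOM reports `tagName`: upper-cased for HTML elements, untouched for foreign ones.

```javascript
// Hypothetical helper mirroring the DOM's `tagName` behaviour:
// HTML names are normalised to upper-case, foreign names keep their case.
function reportedTagName(name, space) {
  return space === 'html' ? name.toUpperCase() : name
}

console.log(reportedTagName('a', 'html')) // 'A'
console.log(reportedTagName('a', 'svg')) // 'a'
console.log(reportedTagName('linearGradient', 'svg')) // 'linearGradient'
```

The same name (`a`) yields different results depending on the space, so the space has to come from somewhere other than the name.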
Proposal
I propose either of the following:
1. `namespace` on some nodes (notably, `root`, `<mathml>`, `<svg>`). To determine the namespace of a node, check its closest ancestor with a namespace.
2. `namespace` on `root` nodes (and wrap `<svg>` and `<mathml>` in `root`s). To determine the namespace of a node, check its closest `root` for a namespace. This changes the semantics of `root`s somewhat.
3. `namespace` on any element.
The downside of the first two is that it’s hard to determine the namespace of an element in a syntax tree without ancestral getters. However, both make moving nodes around quite easy.
The latter is verbose, but does allow for easy access. However, it makes it easy for things to go wrong when shuffling nodes around.
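To show what “closest ancestor with a namespace” might look like in practice, here is a minimal sketch. Everything in it is an assumption for illustration: the `namespace` field is the proposal itself (not something hast has), `closestNamespace` is a hypothetical helper, and falling back to HTML when nothing declares a namespace is my guess at a sensible default. Because nodes have no parent pointers, the ancestors have to be passed in by whoever is walking the tree.

```javascript
// Assumption: `namespace` is the proposed field; it is not part of hast today.
const spaces = {
  html: 'http://www.w3.org/1999/xhtml',
  svg: 'http://www.w3.org/2000/svg'
}

// Walk from the node outwards through its ancestors (given outermost first)
// and return the closest declared namespace.
function closestNamespace(node, ancestors) {
  const chain = [...ancestors, node]

  for (let index = chain.length - 1; index >= 0; index--) {
    if (chain[index].namespace) return chain[index].namespace
  }

  return spaces.html // Assumed default when nothing declares a namespace.
}

const link = {type: 'element', tagName: 'a', children: []}
const svg = {type: 'element', tagName: 'svg', namespace: spaces.svg, children: [link]}
const root = {type: 'root', namespace: spaces.html, children: [svg]}

// Inside `<svg>`, the `a` resolves to the SVG namespace.
console.log(closestNamespace(link, [root, svg]) === spaces.svg) // true

// Moved directly under `root`, the very same object resolves to HTML,
// without touching the node itself.
console.log(closestNamespace(link, [root]) === spaces.html) // true
```

With the third option, `link` would carry its own `namespace` and would keep saying SVG after such a move unless the plugin remembered to update it.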
Note: detecting namespaces upon creation (in `rehype-parse`) is very doable. I’d like to make the usage of `hastscript` and transformers very easy too, though!
Do let me know about your thoughts on this!