Make custom attribute rules consistent with custom element name rules #2271

Open
LeaVerou opened this issue Jan 17, 2017 · 86 comments

@LeaVerou LeaVerou commented Jan 17, 2017

Related WICG discussion: https://discourse.wicg.io/t/relaxing-restrictions-on-custom-attribute-names/1444

Currently, custom attributes need to start with data-. For frameworks with a lot of attributes (Angular, Vue etc), this introduces a serious problem: Either they prefix all attributes with data- and become prone to collisions with other libraries (I've even had two of my own libraries collide!), or they make them extremely verbose (data-ng-*), or they make them non-standard (ng-*, v-*), which is their chosen solution. I'm about to release a library with a lot of attributes and I went for the latter as well. The former two pose serious practical problems, the latter is just conformance.

However, it doesn't have to be this way. Custom elements allow any element name with a hyphen in it, we could do the same for attributes. The cowpaths have been paved: Several very popular libraries follow this practice already. This is not true for proposals like #2250, which introduce a completely new naming scheme.

The main issue with this is all the existing attributes in SVG that come from CSS properties which use hyphens. However, there are several solutions to deal with this:

  • Exclude these prefixes, or just these names. The SVG working group is dying and these attributes must be manually added to the spec, there's no clause that says all CSS props must automatically be available as attributes.
  • Only allow prefixes of 1 or 2 letters. This gives us 26*26 + 25 = 701 more prefixes already, and does not clash with any CSS property that is available as an SVG attribute (z-index is the only CSS property that matches this, and it's not an SVG attribute). It also legalizes Angular & Vue's practices.
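
As a rough illustration of the second option, the check below accepts names whose prefix is one or two letters followed by a hyphen. The function name `hasShortPrefix` and the restriction to lowercase ASCII letters are assumptions made for this sketch, not wording from any spec.

```javascript
// Hypothetical helper: true when an attribute name has a 1- or 2-letter
// prefix followed by a hyphen and at least one more character.
// Assumption: only lowercase ASCII letters count as prefix characters.
function hasShortPrefix(name) {
  return /^[a-z]{1,2}-./.test(name);
}

// Framework-style names mentioned in the thread pass the check.
console.log(hasShortPrefix("ng-model"));     // true
console.log(hasShortPrefix("v-if"));         // true
// Hyphenated SVG presentation attributes have longer prefixes, so they fail.
console.log(hasShortPrefix("stroke-width")); // false
console.log(hasShortPrefix("font-size"));    // false
// data-* stays outside the rule as well.
console.log(hasShortPrefix("data-id"));      // false
// z-index is the one CSS property that matches, as noted in the bullet.
console.log(hasShortPrefix("z-index"));      // true
```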

The more commonplace invalid HTML becomes, the less authors care about authoring valid HTML. Validation becomes pointless in their eyes if they see tons of perfectly good use cases being invalid. Also, if both attributes with and without hyphens are equally invalid, nothing forces developers to stick to any naming scheme. So, I think it would be great if we found a solution for this. And it's a proposal that requires zero effort from implementers, since these attributes already work just fine!

@Marat-Tanalin Marat-Tanalin commented Jan 17, 2017

I’d be fine with this as long as the pre-hyphen part could be empty, so attributes could have names like -foo, -bar, etc.

Otherwise this does not add much over the existing data- prefix (e. g. da- instead of data-) and is probably too problematic compared with the absolutely issue-free and future-proof underscore/hyphen-prefixed custom attributes.

To be fair, better than nothing anyway though.

Member

@annevk annevk commented Jan 18, 2017

I'm supportive of this, but only if we also add an API equivalent to what we added for custom elements. It should be possible for folks to easily observe when such attributes are added, removed, and change in value.

Member

@zcorpan zcorpan commented Jan 18, 2017

I’d be fine with this as long as the pre-hyphen part could be empty, so attributes could have names like -foo, -bar, etc.

Starting with a dash is not XML-compatible. Currently the spec requires data-* attribute names to be XML-compatible, and custom element names as well.

Author

@LeaVerou LeaVerou commented Jan 18, 2017

Otherwise this does not add much over the existing data- prefix (e. g. da- instead of data-) and is probably too problematic compared with the absolutely issue-free and future-proof underscore/hyphen-prefixed custom attributes.

Clearly, you have not considered collisions between libraries and think everything can have the same prefix and the only problem is how to make the prefix less verbose. I don't blame you, I thought they were an edge case in the past as well, but they absolutely are not. With your proposal, libraries would end up doing things like _ng-*, or (most likely) simply not care and continue using ng-* like they've done for years.

Author

@LeaVerou LeaVerou commented Jan 18, 2017

I'm supportive of this, but only if we also add an API equivalent to what we added for custom elements. It should be possible for folks to easily observe when such attributes are added, removed, and change in value.

That would be awesome. So basically, syntactic sugar for MutationObserver?

Member

@annevk annevk commented Jan 18, 2017

The problem with MutationObserver for this use case is that you don't know where the attribute is going to be added. So if you want a global custom attribute, you'd have to observe the entire tree and even then you'd miss certain things, such as shadow trees.
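
To make the limitation concrete, here is a minimal sketch of watching one attribute name across a whole tree with `MutationObserver`. The helper `observeAttribute` and the attribute name `ng-demo` are hypothetical; the observer options used (`attributes`, `attributeFilter`, `attributeOldValue`, `subtree`) are the standard ones. Each shadow root would need its own call, which is the gap described above.

```javascript
// Hypothetical helper: report every change to one attribute name under `root`.
function observeAttribute(root, name, onChange) {
  const observer = new MutationObserver((records) => {
    for (const record of records) {
      onChange(record.target, record.oldValue);
    }
  });
  observer.observe(root, {
    attributes: true,
    attributeFilter: [name], // only report this attribute
    attributeOldValue: true, // expose the previous value on each record
    subtree: true,           // descendants of root, but not shadow trees
  });
  return observer;
}

// Browser-only usage; skipped in environments without a DOM.
if (typeof document !== "undefined") {
  observeAttribute(document, "ng-demo", (element, oldValue) => {
    console.log(element.tagName, oldValue, "->", element.getAttribute("ng-demo"));
  });
}
```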

@Marat-Tanalin Marat-Tanalin commented Jan 18, 2017

@LeaVerou

With your proposal, libraries would end up doing things like _ng-*, or (most likely) simply not care and continue using ng-* like they've done for years.

_ was invalid when those libraries were designed. That is most likely why their authors decided simply to drop the data- prefix (the only valid one at the time) rather than use a generic prefix that would have been formally invalid anyway.

As a side note, it’d probably be wrong to assume that the fact that it’s hard for someone who is already a smoker (existing libraries in terms of custom attributes) to leave off smoking is a reason not to try to prevent others (new products and libraries) from starting smoking (provide a valid short unobtrusive generic prefix).

Author

@LeaVerou LeaVerou commented Jan 18, 2017

The problem with MutationObserver for this use case is that you don't know where the attribute is going to be added. So if you want a global custom attribute, you'd have to observe the entire tree and even then you'd miss certain things, such as shadow trees.

True, and what you're proposing would solve a HUGE problem and I would cry tears of joy once it gets implemented! I'm just a bit concerned that it requires considerably more implementor effort, so adding it could stall. Whereas just permitting such attribute names at first would let us use them and it's a super easy addition to the spec since it requires no implementation effort.

Member

@annevk annevk commented Jan 19, 2017

Fair, I think there is interest to go in this direction once custom elements has shipped. This idea was briefly discussed at the last W3C TPAC. I think the main thing we lack is someone freeing up the time to write the standard. @domenic thoughts?

Collaborator

@rniwa rniwa commented Jan 19, 2017

I think the fact that custom elements kind of encourage people to add random attributes is a serious issue already, so coming up with some convention for author-defined attributes is a win even if we can't add an API for custom attributes yet.

Having said that, we think custom attributes are a much better alternative to the is attribute.

Member

@zcorpan zcorpan commented Jan 19, 2017

Pages in httparchive with attributes that start with _ or non-standard attributes containing -:

SELECT * FROM (
SELECT page, url, REGEXP_EXTRACT(LOWER(body), r'(<[a-z][a-z0-9-]*\s+(?:(?:data-|aria-|http-|accept-)?[a-z]+(?:\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^>\s/"\']+\s+)|\s+))*(?:_[a-z]|(?:[b-ce-gj-z]|d[b-z0-9]|a[a-bd-qs-z0-9]|h[a-su-z0-9]|da[a-su-z0-9]|ar[a-hj-z0-9]|ac[a-bd-z0-9]|ht[a-su-z0-9])[a-z0-9]*-)[^>\s]*\s*=[^>]*>)') AS match
FROM [httparchive:har.2017_01_01_chrome_requests_bodies]
)
WHERE page = url
AND match != "null"
AND NOT REGEXP_MATCH(match, r'["\']\s*\+') # exclude JS string concats
AND NOT REGEXP_MATCH(match, r'<(altglyph|animate|circle|clippath|color-profile|cursor|defs|desc|ellipse|feblend|fecolor|fediffuse|fedisplacement|fedistant|feflood|fefunc|fegauss|feimage|femerge|femorph|feoffset|fepoint|fespec|fespot|fetile|feturb|filter|font|foreign|g\s|glyph|hkern|image|line|marker|mask|metadata|missing|mpath|path|pattern|polygon|polyline|radial|rect|set\s|stop|svg|switch|symbol|text\s|textpath|tref|tspan|use\s|view\s|vkern)') # exclude SVG elements

4068 results: https://gist.github.com/zcorpan/b54592e415a2f79f2ef7f79c0c37b2ed

Of those:

  • 26 have _moz_*
  • 531 have an attribute starting with _ (excluding moz prefix).
  • 22 have x-webkit-* or x-ms-* (the HTML spec for a while recommended vendor extensions to be prefixed with x-vendor-).
  • 57 start with x- (excluding webkit/ms prefixes).
  • 2015 have a prefix of 1 or 2 letters and a dash (excluding x-).
  • 1418 have 3+ letters before the dash.

Other things to note:

  • SVG font-face had x-height and v-alphabetic (etc) attributes. But this element is dead.
  • The HTML attributes with dash are aria-*, data-*, accept-charset, http-equiv.
  • I found an instance of typo of aria -- it would be good if conformance checkers could continue to catch this mistake:
    <button area-invalid="true" aria-required="true" aria-controls="checkincontainer" aria-label="checkin" id="checkinbutton" class="checkinbutton">
    

Member

@zcorpan zcorpan commented Jan 19, 2017

For comparison, equivalent query for data-* gives 59,755 results. So data-* is about 15 times more common than non-standard custom attributes (excluding _moz_, x-webkit-, x-ms-).

Member

@zcorpan zcorpan commented Jan 19, 2017

So @LeaVerou's proposal is used by ~0.4% of pages in httparchive; @Marat-Tanalin's proposal is used by ~0.1%. data-* is used by ~12.1%. (The data set is 494,956 pages.)
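
These percentages follow from the counts posted earlier in the thread; the short check below just redoes that arithmetic. The variable names are illustrative, and subtracting the `_moz_*` and `x-webkit-*`/`x-ms-*` counts from the 4068 results is one reading of "excluding _moz_, x-webkit-, x-ms-" (the ratio rounds to 15 either way).

```javascript
// Counts quoted in the thread (httparchive, 2017-01-01 crawl).
const totalPages = 494956;
const dataPages = 59755;       // pages using data-* attributes
const shortPrefixPages = 2015; // 1- or 2-letter prefix plus a dash
const underscorePages = 531;   // attributes starting with _ (excluding _moz_)
const nonStandardPages = 4068 - 26 - 22; // assumption: drop _moz_* and x-webkit-*/x-ms-*

// Share of all pages, as a percentage rounded to one decimal place.
const percent = (count) => Math.round((count / totalPages) * 1000) / 10;

console.log(percent(shortPrefixPages)); // 0.4
console.log(percent(underscorePages));  // 0.1
console.log(percent(dataPages));        // 12.1
console.log(Math.round(dataPages / nonStandardPages)); // 15
```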

Since the point here is to adopt what people like or use anyway, if we are to do this, it seems most reasonable to me to allow both. But we should disallow _moz_, x-webkit- and x-ms- and 3+ letter prefix followed by dash (to avoid clashes in SVG, and to make it possible to tell if an attribute is a "custom attribute" or not, and to catch typos in aria- or data-), as well as anything not XML-compatible. But no need to restrict the prefix to [a-z], I believe (data-* and custom element names allow other XML-compatible characters).
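
One way to read this combined rule is as a classifier over attribute names. The sketch below is an interpretation, not spec text: the function `classifyAttribute`, its labels, and the ASCII-only prefix patterns are assumptions (the comment itself says the prefix need not be limited to [a-z]), and XML-compatibility is not checked.

```javascript
// Hypothetical classifier for the "allow both, with exclusions" idea.
function classifyAttribute(name) {
  const n = name.toLowerCase();
  // Vendor-style prefixes are checked first so x-webkit- is not treated as x-.
  if (/^(_moz_|x-webkit-|x-ms-)/.test(n)) return "disallowed: vendor prefix";
  if (/^(data|aria)-./.test(n)) return "standard: data-*/aria-*";
  if (n === "accept-charset" || n === "http-equiv") return "standard: HTML attribute";
  if (/^_./.test(n)) return "custom: underscore prefix";
  if (/^[a-z]{1,2}-./.test(n)) return "custom: short prefix";
  // Catches SVG-style names and typos such as area-invalid.
  if (/^[a-z]{3,}-/.test(n)) return "disallowed: 3+ letter prefix";
  return "other";
}

for (const name of ["ng-app", "_foo", "area-invalid", "x-webkit-speech", "data-id"]) {
  console.log(name, "=>", classifyAttribute(name));
}
```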

Member

@domenic domenic commented Jan 19, 2017

I still feel there's a strong advantage to sticking to a single sanctioned convention (data-) for custom data attributes, at least until we have a processing model for the "custom attributes".

If people want to go against that convention, that's their choice, but we shouldn't give them a free pass; they're making a conscious choice to trade conformance and ecosystem compatibility for convenience.

@Marat-Tanalin Marat-Tanalin commented Jan 19, 2017

@domenic Sorry, but that’s just a purely theoretical statement totally detached from reality.

As a practicing web developer, I’m quite happy with what we already have currently feature-wise: getAttribute() / setAttribute() / removeAttribute() in JS and [attribute] in selectors.

The only issue here is the artificial validity limitation that could and should be easily removed on spec level. Having (or not) a processing model for custom attributes does not affect the ability to use such attributes right now (to be clear: I’m specifically about _-prefixed attributes that are 100% future-proof).

Author

@LeaVerou LeaVerou commented Jan 19, 2017

Thanks for the data @zcorpan!! Very enlightening. I find it surprising that Angular and Vue would only be used by 0.4% of websites. Perhaps a lot of these attributes are added dynamically? Also, I'm not surprised that data- has such a high percentage: Small libraries that only add 1 or 2 attributes can easily use data- and be less worried about either collisions or verbosity. It only takes 1 such library for a page to qualify as having a data- attribute.

It's also an interesting idea to allow both proposals. I don't see any problem with that, flexibility is good!

@domenic Several people have commented about the problems with data-. Developers of popular libraries with many attributes are not using data-. Even those that supported both their own prefix- and a data-prefix- version of each attribute are dropping the latter because nobody is using it, probably because data-prefix- is a verbose abomination. And you resist legalizing anything other than data- because of some theoretical purity argument about "a single sanctioned convention"? What happened to the priority of constituencies? Doesn't author convenience come several levels before theoretical purity?!

Member

@domenic domenic commented Jan 19, 2017

Several people have commented about the problems with data-. Developers of popular libraries with many attributes are not using data-. Even those that supported both their own prefix- and a data-prefix- version of each attribute are dropping the latter because nobody is using it, probably because data-prefix- is a verbose abomination.

This argument (and I would appreciate if you avoided phrases like "abomination" in reasoned discussion) is based on anecdotes, whereas @zcorpan shows soundly with data that it does not hold in the real world. A small minority of developers using custom attributes are unhappy with data; 15x more are happy with data than are unhappy. They can be vocal, as you are, but saying that this is a widespread problem is just not supported.

And you resist legalizing anything other than data- because of some theoretical purity argument about "a single sanctioned convention"? What happened to the priority of constituencies? Doesn't author convenience come several levels before theoretical purity?!

Sorry, but that’s just a purely theoretical statement totally detached from reality.

I don't think it's helpful or accurate to characterize the argument as one of theoretical purity, or start invoking the priority of constituencies before any such violation is apparent. This is about the practical impact of fracturing the ecosystem into multiple conventions for custom data. That has real impact on tooling, libraries, authors reading other authors' source code, API consistency and predictability (why do some data properties get a dataset API, and others don't?) and much more.
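
For context on the `dataset` point: `data-*` attributes are exposed on `element.dataset` under camel-cased keys, while other prefixes get no such mapping. The helper below mimics that name conversion in plain JavaScript; `datasetKey` is a hypothetical name, and the sketch ignores edge cases such as uppercase letters in the attribute name.

```javascript
// Hypothetical helper mirroring how a data-* attribute name maps to a
// dataset key: drop "data-", then turn each "-x" into "X".
function datasetKey(attributeName) {
  if (!attributeName.startsWith("data-")) return undefined; // no dataset entry
  return attributeName
    .slice("data-".length)
    .replace(/-([a-z])/g, (match, letter) => letter.toUpperCase());
}

console.log(datasetKey("data-src"));      // src
console.log(datasetKey("data-ng-model")); // ngModel
console.log(datasetKey("ng-model"));      // undefined
```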


Again, I repeat that there is nothing stopping you from making a conscious choice between conformance and brevity. If you value brevity so much as to start calling data- attributes an abomination, I presume you value it more than conformant documents. That's fine! You can make that choice! As you yourself have noted, there's nothing stopping you. But it doesn't mean the spec should stop trying to keep the ecosystem coherent to the best of its abilities.

@Marat-Tanalin Marat-Tanalin commented Jan 19, 2017

@domenic

15x more are happy with data than are unhappy.

The obvious reason of prevalence of data--prefixed attributes over other prefixes is that data- is the only formally valid option for now. This has nothing to do with whether people are actually happy with it.

Good web developers usually prefer to keep their documents valid, and not just because that makes them “feel good”, but also so they can use validators to see real errors more easily, not intermixed with fictitious pseudo-errors that stem from artificial spec-level limitations not matching reality.

why do some data properties get a dataset API, and others don't?

Because not all custom attributes are data attributes. data- attributes are for data, custom-prefixed attributes are for custom needs whatever those are.

Contributor

@stevefaulkner stevefaulkner commented Jan 19, 2017

@zcorpan wrote:

So @LeaVerou's proposal is used by ~0.4% of pages in httparchive; @Marat-Tanalin's proposal is used by ~0.1%. data-* is used by ~12.1%. (The data set is 494,956 pages.)

does that mean that some other form of prefix is used by the other 87%?

Member

@domenic domenic commented Jan 19, 2017

The obvious reason of prevalence of data--prefixed attributes over other prefixes is that data- is the only formally valid option for now. This has nothing to do with whether people are actually happy with it.

That's an interesting speculation. Fortunately, it's also one we can answer, or at least upper-bound, with data. That is, what percentage of those ~12.1% of pages are conformant? In other words, what percentage of people using data-* attributes are also people who care about conformance, and thus might have chosen data- over x- because of conformance concerns?

Similarly, what percentage of the ~0.5% using nonstandard prefixes are conformant-except-for-bad-prefixes? This number is especially interesting, because it indicates people who are interested in conformance but just aren't willing to change their prefixes. Certainly you and Lea might fall in that sub-bucket of the ~0.5%. (Although maybe not?) But how many of that ~0.5% are you representing?


Another point worth making is the analogy to a previous push to use <i> for icons. The reasoning was exactly the same: lots of people are doing it, because it's shorter than the recommendation in the spec (<span> with fallback text). We even did a HTTP archive search, and found that many more developers would "benefit" from allowing this than the fraction-of-~0.5% being discussed here. But allowing <i> for icons has many practical downsides---the same ones I listed before for allowing non-data- prefixes for custom data attributes. For that reason, we didn't do it.


does that mean that some other form of prefix is used by the other 87%?

I assume it means they are not using any prefixed attributes (data- or otherwise) at all.

Member

@domenic domenic commented Jan 19, 2017

Let me also repeat that I do support exploring the concept of custom attributes, with a processing model similar to custom elements. That gives serious benefits beyond just brevity, that IMO outweigh the practical disadvantages. It's the simple conformance change with no processing model that I am not in support of.

Author

@LeaVerou LeaVerou commented Jan 19, 2017

Again, I repeat that there is nothing stopping you from making a conscious choice between conformance and brevity. If you value brevity so much as to start calling data- attributes an abomination, I presume you value it more than conformant documents. That's fine! You can make that choice! As you yourself have noted, there's nothing stopping you. But it doesn't mean the spec should stop trying to keep the ecosystem coherent to the best of its abilities.
@Marat-Tanalin

Authors don't typically invent their own attributes, and when they do, data- is fine. Most custom attributes are used because a library/framework will utilize them. Therefore, the person using the attribute is not the same person that decided on its naming. It's not about my choice, it's about making the right choice for the users of my library. I don't want to impose verbosity on them and litter their markup with lengthy prefixes, and I don't want to impose nonconformance on them. Library devs should not be forced into this dilemma.

Re: fracturing the ecosystem, how does that not apply to custom element names?

whereas @zcorpan shows soundly with data that it does not hold in the real world

While I definitely commend the effort to get real data, I would take that percentage with a grain of salt:

  1. We're basically parsing HTML with regexes here
  2. None of this accounts for dynamically added attributes.
  3. It counts occurrence of each naming scheme per page, whereas I suspect that when ng- or v- attributes are used, A LOT of them are used.
  4. As I mentioned above, smaller libraries can use data- just fine. When you only have one or two attributes, the verbosity doesn't matter much and the collisions are more rare. It only takes 1 such library for a page to count in @zcorpan's data.
  5. As @Marat-Tanalin mentioned, data- is the only conformant option right now, don't you think that affects usage?
  6. These stats go against common knowledge: Angular and Vue are very popular, it seems weird that they'd be collectively used by only 0.4% of websites.

Fortunately, it's also one we can answer with data. That is, what percentage of those ~12.1% of pages are conformant? In other words, what percentage of people using data-* attributes are also people who care about conformance, and thus might have chosen data- over x- because of conformance concerns?

You're assuming here that everybody who cares about conformance is actually conformant. A parallel about religions and sins comes to mind. :) Many authors care about conformance, but don't actually validate, so they make mistakes that are never caught. However, conformance still influences their decision making.

Author

@LeaVerou LeaVerou commented Jan 19, 2017

Let me also repeat that I do support exploring the concept of custom attributes, with a processing model similar to custom elements. That gives serious benefits beyond just brevity, that IMO outweigh the practical disadvantages. It's the simple conformance change with no processing model that I am not in support of.

Nobody is against that. As I said above, that would be incredible! It would make my life so much easier. What I was suggesting is making the conformance change first, since it's easy, and adding the (harder to design) API as a later step, once it gets implementor interest and a spec editor willing to do it.

@Marat-Tanalin Marat-Tanalin commented Jan 19, 2017

@domenic

(Although maybe not?)

I would appreciate if you avoided further trolling.

FYI, unlike what you’ve probably naively assumed, I am aware the pubdate attribute is currently not in the HTML spec, so using the formally invalid attribute is not accidental. I use the attribute intentionally since it was previously specced and perfectly valid, but then has been removed on a purely theoretical basis by someone who unfortunately has a sort of overformal logical approach (but who is still able to be respectful and deserves to be respected) somewhat similar to yours, and recommended to use the bolted-on verbose pseudosemantic surrogate called Microdata instead. (Btw, the same person also tried to remove the TIME element in favor of a new cool universal element called… DATA, but fortunately failed thanks to massive web-developers’ objections.) Violating the current version of the HTML spec by continuing to use the pubdate attribute solely on my own site is a sort of my conscious and consistent objection to that (wrong in my opinion) decision. Moreover, according to my experience, at least Google search engine does support the attribute regardless of that it has been removed from the spec, so its use still makes sense in practice.

@Marat-Tanalin Marat-Tanalin commented Jan 19, 2017

Another point worth making is the analogy to a previous push to use <i> for icons.

Any analogy suffers from inaccuracies, is not a proof or an argument of any kind, and is often actually just irrelevant offtopic noise.

Member

@Hixie Hixie commented Jan 19, 2017

Let's please remain focused on the technical issues.

Author

@LeaVerou LeaVerou commented Dec 6, 2017

Another use case: Web Components that degrade gracefully.

For example, take a look at this carousel component

It’s used like this:

<skeleton-carousel dots nav loop>
  <iron-image placeholder="https://source.unsplash.com/category/nature/10x10"
              data-src="https://source.unsplash.com/category/nature/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
  <iron-image placeholder="https://source.unsplash.com/category/food/10x10"
              data-src="https://source.unsplash.com/category/food/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
  <iron-image placeholder="https://source.unsplash.com/category/buildings/10x10"
              data-src="https://source.unsplash.com/category/buildings/500x300"
              sizing="cover"
              preload
              fade
              ></iron-image>
</skeleton-carousel>

Wouldn't it be great if its content was proper <img> tags, so that something reasonable is visible in older browsers?
But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

@chaals chaals commented Apr 25, 2018

@LeaVerou

Wouldn't it be great if its content was proper <img> tags, so that something reasonable is visible in older browsers?
But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

What am I missing? If you customised the img element, e.g.

<img is="iron-image" src="some.img" alt="what?"...>

You would use normal attributes where they existed, no? Isn't it only an issue where you are making up something completely new anyway?

@effulgentsia effulgentsia commented Apr 25, 2018

@chaals: I think the issue is that the current custom elements spec says:

Customized built-in elements follow the normal requirements for attributes, based on the elements they extend. To add custom attribute-based behavior, use data-* attributes.

So if you write a component as <img is="iron-image", then all attributes defined by the iron-image component need to be data- prefixed. But data-* attributes are also used by whatever other scripts (unrelated to the iron-image component) might be interacting with the page that contains an <img is="iron-image" element. Hence, @LeaVerou's observation that:

no indication of which [data-*] attributes belong to the component and which ones don't

Author

@LeaVerou LeaVerou commented Apr 28, 2018

@chaals They are not customizing the <img> element because they don't want data- prefixed attributes (and I don't blame them). They use custom elements just so they can use shorter attribute names, so there is no fallback.

@strongholdmedia strongholdmedia commented Apr 30, 2018

But if you do that, then the attributes would have to be data- prefixed with no indication of which attributes belong to the component and which ones don't.

Following your logic, those attributes that display "something reasonable" in "older" browsers -or affect the appearance - do belong to the "component", while the others don't.

In my opinion, one should not use the markup layer for state and unpredictable side effects at all, for that is violation of the single responsibility principle. But this is exactly what Angular or Vue does.

I believe that people should not at all use something like your skeleton-carousel in the markup as well. For these types of things, there is - was and will be - XML/XSLT always, should anybody find the need.

After all, what could the benefits be of "knowing" what "attributes" do, according to the designer's own logic, belong to the component, if one could not reasonably deduce what attributes will actually affect the rendering in any conformant and well-specified client?
Does anyone really want to reduce the concept of well-formedness to having an even number of quotation marks or inequality marks?

I think, after following this discourse for a while, that perhaps the idea of a specific layer, just like XML/XSLT but maybe distinct, being promoted towards those people who are obsessed with this component-oriented thing that is, in my opinion, somewhat distinct and distant from the concept of DOM and what it was conceived for, that also conveys the abandon hope all ye who enter here type of note in and of itself for others; and that HTML be left to those who actually prefer documentation over convention of people with random mindsets that they may, at times, consider counter-intuitive or even marginal, is possibly better for both worlds.

alice added a commit to alice/html that referenced this issue Jan 8, 2019
In whatwg#2271 it is argued that
data-* attributes are not necessarily "data", any more than
standard attributes are "data". Clarify intent and add a new
example where a data-* attribute is used as a boolean attribute.

@Jamesernator Jamesernator commented May 31, 2019

I don't see it being likely that people will use data- attributes for custom elements. I don't think I've ever even seen an example of custom elements that uses them (even the HTML spec does not); there's no encouragement from any existing solutions and no push from custom element authors to use data--prefixed attributes.

I think non-conforming names are web reality already anyway: a decent number of sites are already using these non-conforming attributes (and even ones without hyphens). Regardless of whether the WHATWG agrees to change the spec, new global attributes will still need to be checked for web compatibility.

@strongholdmedia

This comment was marked as off-topic.

Member

@zcorpan zcorpan commented Jun 3, 2019

Please don't add off-topic comments. This issue is about custom attributes.

To move this issue forward, a good step would be to ask implementers if there's interest in an API for observing changes to custom attributes as annevk suggested.

@nuxodin nuxodin commented Jun 11, 2020

Since the css function attr() will be usable with all attributes, it might be a good time to think about a specification for custom attributes.
https://www.w3.org/TR/css3-values/#attr-notation

Contributor

@Malvoz Malvoz commented Oct 20, 2020

Since the css function attr() will be usable with all attributes

attr() may be limited to a subset of prefixed attributes, see w3c/csswg-drafts#5136.

@JoshuaWise JoshuaWise commented Dec 14, 2020

I'm currently in the process of writing a framework built on Web Components (custom elements), and I'm having a very hard time figuring out how to handle the name-spacing of attributes. The way I see it, there are 3 different agents who may want to define attributes on custom elements:

  • The consumer of the custom element. The intent may be to mark the element for querySelector() purposes or for CSS.
  • The author of the custom element. The intent may be to provide an interface with the consumer for receiving initial state or reflecting the element's state.
  • The user-agent (browser), which will inevitably define new global attributes in the future, for arbitrary purposes.

The first group of people (consumers of the element) can simply use data-* attributes, which are reserved by the spec for this purpose.

The third group of people (browsers) tend to define attributes that are single lowercased words, but I'm not confident that I can rely on that assumption.

The second group of people (authors of custom elements) seemingly have no good solution. They can't use data-* attributes because those are reserved for the consumers of the element. And without some guarantees about the naming of future global attributes, they have no way of protecting themselves against future name collisions.

As a software engineer, the obvious solution to me is namespaces. If we can't use colon (:) namespaces due to XML compatibility, then hyphen (-) namespaces seem perfectly fine. Each independent agent can define their own namespace to work in. The data- namespace is for the website author. The "empty" namespace (no hyphen) is for browsers. And every other namespace (except aria-, I guess) is for everybody in-between.
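To make the split concrete, here is a minimal sketch of how a tool might sort attribute names into those namespaces. The `classifyAttribute` function and the `AttributeOwner` type are hypothetical names made up for illustration; nothing like this exists in the spec or in any browser, and existing hyphenated standard attributes such as http-equiv are not handled.

```typescript
// Hypothetical helper, not part of any spec or browser API.
// It only encodes the namespace split described above.
type AttributeOwner = "browser" | "consumer" | "accessibility" | "author";

function classifyAttribute(name: string): AttributeOwner {
  const lower = name.toLowerCase();
  // data-* is reserved for the website author (the consumer of the element).
  if (lower.startsWith("data-")) return "consumer";
  // aria-* is already taken by the accessibility specs.
  if (lower.startsWith("aria-")) return "accessibility";
  // The "empty" namespace (no hyphen) is left to browsers.
  if (!lower.includes("-")) return "browser";
  // Every other hyphenated name is for everybody in between.
  return "author";
}
```

For example, `classifyAttribute("ng-model")` falls through to the last branch, while `classifyAttribute("hidden")` lands in the browser namespace.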

@strongholdmedia strongholdmedia commented Dec 14, 2020

> As a software engineer, the obvious solution to me is namespaces. If we can't use colon (:) namespaces due to XML compatibility, then hyphen (-) namespaces seem perfectly fine. Each independent agent can define their own namespace to work in. The data- namespace is for the website author. The "empty" namespace (no hyphen) is for browsers. And every other namespace (except aria-, I guess) is for everybody in-between.

You, sir, as your name suggests, are indeed very wise.
I also argued that something similar be adopted or kept, but I ran into a persistent attitude of some actors doing whatever they deem feasible, whenever they like, and so I gave up.

@enkelmedia enkelmedia commented Jun 8, 2021

This really needs attention. I don't understand why the standard enforces rules that make functional, cross-browser code invalid "HTML", so that we either have to stop caring about the standard or explain to customers over and over that the standard is behind reality, leaving them worried for no real reason.

It's time to get up to speed with how things are actually used and update the standards - otherwise the relevance of the standard will decrease and become something that people see as "something from the past".

There are plenty of good ideas from 3-4 years ago - why has this gone stale?

@strongholdmedia

This comment was marked as off-topic.

Member

@zcorpan zcorpan commented Jun 9, 2021

@strongholdmedia Please do not derail the discussion with issues that have nothing to do with the topic at hand.

@enkelmedia my previous comment suggests a next step for this issue.

@RouninMedia RouninMedia commented Jul 16, 2021

Amongst others, @JoshuaWise's comments (2020-12-14) suggest a clear outline for a practical, useful and consistent approach moving forward:

  • non-hyphenated attributes: standard, attributes introduced by Spec Authors
  • hyphenated attributes: custom attributes introduced by Custom Element / WebComponent Authors
  • data-attributes: custom attributes applied by Consumers of standard & custom elements

This leaves Library / Framework / WebComponent Authors (the middle group) needing to take note of a couple of well-known, reserved hyphenated prefixes - eg. don't use the prefix http- (because it already exists in http-equiv) and don't use data- or aria- - but otherwise Library / Framework / WebComponent Authors retain a free hand to build their own hyphenated custom attribute names, constrained only by the same requirements which apply to custom element names.

This means both ng- and v- can be welcomed (at last) as valid custom attribute prefixes.
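As a rough sketch of what such a check could look like: the `RESERVED_PREFIXES` list and the `isValidCustomAttributeName` function below are hypothetical, the list contains only the three prefixes named above, and the regular expression is a deliberately simplified stand-in for the custom element name rules (lowercase ASCII first letter, at least one hyphen, no uppercase ASCII letters), not the full production from the spec.

```typescript
// Hypothetical list: only the prefixes mentioned in this comment.
// A real list would need every hyphenated prefix already used by standards.
const RESERVED_PREFIXES: readonly string[] = ["data-", "aria-", "http-"];

// Simplified stand-in for the custom element name rules:
// starts with a lowercase ASCII letter, contains at least one hyphen,
// and has no uppercase ASCII letters or whitespace.
const HYPHENATED_NAME = /^[a-z][^A-Z\s]*-[^A-Z\s]*$/;

function isValidCustomAttributeName(name: string): boolean {
  if (!HYPHENATED_NAME.test(name)) return false;
  // Reject names that start with a well-known reserved prefix.
  return !RESERVED_PREFIXES.some((prefix) => name.startsWith(prefix));
}
```

Under these assumptions `ng-model` and `v-if` pass, while `data-foo`, `http-equiv` and the non-hyphenated `hidden` are rejected.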

Arguably, the most significant issue to resolve remains what to do about SVG (as @LeaVerou mentioned at the very beginning) since standard attribute names are frequently hyphenated in SVG. This threatens a worst case scenario of many name collisions between standard (hyphenated) SVG attribute names and custom (hyphenated) attribute names: not only in the present but (worse) in the future.

Perhaps here is where the leading underscore can come into play? A leading underscore which the SVG parser always takes note of but which remains optional in HTML, because the HTML parser always ignores it? (In the same way that the HTML5 parser ignores any XHTML-style trailing slash in self-closing elements).

Thus, in HTML:

  • enable-background
  • _enable-background

are functionally identical and in practice - or most of the time, at least - only the former will ever tend to be used.

Whereas, in SVG:

  • enable-background
  • _enable-background

the former is parsed as a specced standard attribute, while the latter may be immediately recognised (by developers and user-agents) as a custom attribute.

Is that too confusing? To have _enable-background mean the same thing as enable-background in HTML, but for the two names to mean two different things in SVG? There certainly is a precedent for syntax not always meaning the same thing in HTML and SVG - not least in that SVG is case-sensitive, while HTML is case-insensitive.

Advice to custom element authors would be:

  • if you wish to, you can, in every context, always prefix your hyphenated custom attributes with an underscore
  • though, for all practical purposes it makes no difference whether you do or not in HTML
  • however in SVG, it absolutely does make a difference, so when writing SVG, be sure to always prefix with an underscore
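To spell out the two readings, here is a small sketch of the rule as proposed. This is not how any parser behaves today; `readAttribute`, `MarkupContext` and `AttributeReading` are hypothetical names used only for illustration, and the reserved prefixes (data-, aria-, http-) are left out for brevity.

```typescript
// Hypothetical model of the proposal, not existing parser behaviour.
type MarkupContext = "html" | "svg";

interface AttributeReading {
  effectiveName: string; // the name after the proposed underscore handling
  isCustom: boolean;     // whether the proposal would treat it as custom
}

function readAttribute(name: string, context: MarkupContext): AttributeReading {
  const hasUnderscore = name.startsWith("_");
  if (context === "svg") {
    // In SVG the underscore is significant: the name is kept as written
    // and the underscore alone marks the attribute as custom.
    return { effectiveName: name, isCustom: hasUnderscore };
  }
  // In HTML the proposal ignores a leading underscore;
  // a hyphen is what signals a custom attribute.
  const effectiveName = hasUnderscore ? name.slice(1) : name;
  return { effectiveName, isCustom: effectiveName.includes("-") };
}
```

With this model, `_enable-background` and `enable-background` give the same reading in HTML, but in SVG only the underscored form is reported as custom.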

Member

@zcorpan zcorpan commented Aug 16, 2021

We can't ignore a leading _ in attribute names in HTML, that would likely break content that uses it and expects the underscore to not be ignored.

@RouninMedia RouninMedia commented Aug 16, 2021

Three (genuine) questions in response:

  1. Are there already standard attribute names in HTML which begin with a _ ?

  2. Are there any frameworks / libraries / environments where a pair of distinct custom attributes exist which have identical names, save for the fact that one begins with a leading underscore and the other does not?

  3. Are there any frameworks / libraries / environments which introduce (or allow for) a custom attribute which has an identical name to an already-existing standard attribute, save for the fact that it begins with a leading underscore?

Member

@zcorpan zcorpan commented Aug 16, 2021

  1. No
  2. It seems unlikely, but I don't know.
  3. There are such instances in https://gist.github.com/zcorpan/b54592e415a2f79f2ef7f79c0c37b2ed e.g. <img _src=...>

Last time I looked at non-standard attributes in HTTP Archive (see #2271 (comment) ), there were 531 instances with a leading _ excluding _moz_. Those pages might use those attributes from JS or CSS and therefore rely on the _ not being ignored (e.g. removed by the HTML parser).
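A rough illustration of the breakage, using a toy model rather than a real parser: `parseIgnoringUnderscore` is a hypothetical function that drops a leading underscore the way the proposal describes, and the attribute values are made up. When two names collapse into one, the first occurrence wins, which matches how the HTML parser treats duplicate attributes.

```typescript
// Hypothetical model of a parser that drops a leading underscore.
function parseIgnoringUnderscore(attrs: Array<[string, string]>): Map<string, string> {
  const parsed = new Map<string, string>();
  for (const [name, value] of attrs) {
    const key = name.startsWith("_") ? name.slice(1) : name;
    // First occurrence wins, as with duplicate attributes in the HTML parser.
    if (!parsed.has(key)) parsed.set(key, value);
  }
  return parsed;
}

// Made-up example in the style of <img src=... _src=...>.
const parsed = parseIgnoringUnderscore([
  ["src", "small.png"],
  ["_src", "large.png"],
]);

// A script or selector that looks for "_src" no longer finds anything.
const lookup = parsed.get("_src");
```

Here `lookup` is `undefined` and only `src` survives, so any JS or CSS on the page that relies on `_src` stops working.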

@RouninMedia RouninMedia commented Aug 16, 2021

Many thanks for that clarification, @zcorpan.

Yes, I concede: we can't make a leading _ character ostensibly superfluous in custom HTML attributes if attributes such as _src are already in use alongside src.

Not wishing to sound absurd, but if a single _ as an arbitrary prefix is out of the question, then what about a double __?

After all, in this suggestion, the HTML-optional / SVG-obligatory underscore(s) aren't being introduced as prefixes for the benefit of the HTML parser - the HTML parser is already capable of recognising that any attribute which includes two hyphenated words (of which the first isn't aria-, http- etc.) must be a Custom Attribute.

The purpose of introducing HTML-optional / SVG-obligatory underscores is so the SVG parser may immediately distinguish between regular attributes with hyphens and Custom Attributes which (also obligatorily) include hyphens.

That is:

  1. a hyphen is enough of a distinguishing feature in HTML to indicate that the attribute is a Custom Attribute (subject to not using a small handful of reserved hyphenated prefixes)

  2. a hyphen is an insufficiently distinguishing feature in SVG, so another feature - in this case a double underscore prefix - is utilised

  3. the HTML parser knows to ignore the double underscore prefix when it sees it, since this is an SVG convention and instead will only look for whether the attribute name is hyphenated or not
