HTML parser changes for customizable <select> #10310

Open
josepharhar opened this issue Apr 30, 2024 · 42 comments · May be fixed by #10557
Labels
addition/proposal New features or enhancements topic: forms topic: parser topic: select The <select> element

Comments

@josepharhar
Contributor

josepharhar commented Apr 30, 2024

What is the issue with the HTML Standard?

In order to support the stylable select aka appearance:base-select proposal, we need to make changes to the HTML parser to allow more tags inside <select>, because the current HTML parser basically throws away all tags besides <option>, <optgroup>, and <hr>.

Here are options for how we can extend the HTML parser with varying levels of web compatibility:

  1. Allow any tags within <select>
  2. Allow <button> and <datalist> tags in <select>, and allow any tags within <button> and <datalist>
  3. Allow any tags within a <select>’s <option>, <button>, or <datalist>

1. Allow any tags within <select>

Allowing any tags within <select> would be good because it is more flexible for developers since they won’t necessarily have to add a <datalist>, but it is the most breaking change of the options I listed. I added a use counter for tags which get dropped in select mode, and it is at 0.3%, which is quite high. I also looked at dozens of the websites with an experimental patch to allow any tags, and while most of them seemed unaffected by allowing anything, some of them were broken:

  • http://tx.7ma.cn/
    This website has a <select> tag without a closing </select> tag, and it causes a bunch of the HTML to get put inside the <select> instead of being rendered/parsed after the <select>.
  • http://top.ge/
    This website has a <select> with additional tags inside the <option>s, but it doesn’t actually use the <select> element to render and instead has some other custom thing which appears to be reading out the DOM contents of the <select>. The dropdown in the website now has a bunch of newlines and odd content in the options.

There’s good reason to believe other sites will be affected in a similar way.

2. Allow <button> and <datalist> tags in <select>

A more web-compatible option would be to make the parser allow <button> and <datalist> in <select>, and then make the parser allow anything inside <button>/<datalist>. These tags correspond to the two visual parts of the <select> as per the explainer. I have use counters for <button> and <datalist> tags inside <select> here, and they have very low usage (0.001% and 0.0001% respectively):
https://chromestatus.com/metrics/feature/timeline/popularity/4771
https://chromestatus.com/metrics/feature/timeline/popularity/4772

These usage numbers are low enough that we would be willing to ship in Chrome.

3. Allow any tags within a <select>’s <option>, <button>, or <datalist>

This is like the first option, but doesn’t allow other tags in between <option>s. Based on the sites which broke, I don’t think this would be significantly more compatible than just allowing all tags. I also think this might be confusing to developers when arbitrary content can be added inside options but not in between them unless they are all wrapped by a datalist tag.
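As a rough way to compare the three options, here is a toy model in Python. It is only a sketch and not the HTML tree-construction algorithm: `is_kept`, `ALWAYS_KEPT`, and `WRAPPERS` are names made up for illustration, and it assumes that option 3 also lets `<button>` and `<datalist>` themselves appear directly in `<select>`.

```python
# Toy model (illustration only): would a start tag seen somewhere inside
# <select> be kept in the DOM under each proposed option?
ALWAYS_KEPT = {"option", "optgroup", "hr"}   # kept by the parser today
WRAPPERS = {"button", "datalist"}            # the opt-in wrappers

def is_kept(proposal, ancestors, tag):
    """ancestors: open elements between <select> and the new tag (select excluded)."""
    if tag in ALWAYS_KEPT:
        return True
    if proposal == 1:                        # any tag anywhere in <select>
        return True
    if proposal == 2:                        # wrappers, plus anything inside them
        return tag in WRAPPERS or bool(WRAPPERS & set(ancestors))
    if proposal == 3:                        # anything inside option/button/datalist
        return tag in WRAPPERS or bool((WRAPPERS | {"option"}) & set(ancestors))
    raise ValueError(f"unknown proposal: {proposal}")

if __name__ == "__main__":
    cases = [([], "div"), (["option"], "span"), (["datalist"], "div")]
    for proposal in (1, 2, 3):
        for ancestors, tag in cases:
            print(proposal, ancestors, tag, is_kept(proposal, ancestors, tag))
```

The only case where options 1 and 3 differ here is a tag placed directly between `<option>`s, which matches the point above that option 3 buys little extra compatibility.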

I also looked through the commit logs of both the WebKit/Chromium implementation and the spec to find out why tags are dropped inside <select>, but I didn’t find anything useful. The implementation had minimal context. Going back through the commit log of the HTML spec, I reached the initial commit of the git repo, which didn’t explain it either.


My preference is option 2 because it has the lowest risk of breaking websites, but I have gotten feedback that requiring developers to write <datalist> is not great. Nevertheless, given this compat data I think that’s our best option for moving forward.

@josepharhar josepharhar added the agenda+ To be discussed at a triage meeting label Apr 30, 2024
@aardrian

aardrian commented May 1, 2024

When you say "any tags" do you genuinely believe any element can and should be allowed in an <option> and/or <select>? As in, would this construct be legal?

<select>
 <main>
  <h1>sandwiches</h1>
  <ul>
   <li><option>Meat</option></li>
   <li><option>Not Meat</option></li>
  </ul>
 </main>
</select>

I know that only addresses your option 1, but pretend I am doing similar for options 2 and 3.

And if you do mean that, how do you propose exposing that through accessibility APIs? Using my example, if I want to navigate by landmarks, is the main region hidden until I expand the <select>? What happens when the <select> has focus and I just want to arrow through options using only my keyboard?

The scope of this suggestion feels more like using a howitzer to swat a fly (that fly being devs who fail to close their select menus). Or at least I am not understanding the benefit for users as outlined above.

@josepharhar
Contributor Author

When you say "any tags" do you genuinely believe any element can and should be allowed in an <option> and/or <select>? As in, would this construct be legal?

We discussed this in OpenUI here, and the conclusion as I understand it was to allow anything to be parsed/rendered like normal content, but emit console messages showing why it is not correct for accessibility: openui/open-ui#540

@scottaohara do you have any additional thoughts on this topic?

@aardrian

aardrian commented May 1, 2024

That's a lot of reading, but this appears to be the resolution:

RESOLVED: Allow interactive content outside of <options>, but issue strong console errors or warnings in this case.

Which, ok, the console will tell devs to maybe not do that. However, you are making an issue to allow it. And your take on that resolution is:

the conclusion as I understand it was to allow anything to be parsed/rendered like normal content, but emit console messages showing why it is not correct for accessibility

Given that...

  • In your words, what is "normal content"?
  • Using my example above, how do you envision that working for users?

These aren't questions for Scott as he didn't file this issue. I would like to know your take.

@annevk
Member

annevk commented May 2, 2024

The parser changes themselves don't have a11y impact. Web developers can already construct arbitrary node trees using API calls and browsers already have to handle those. This is purely about what trees can be constructed using syntax instead of API calls.

Now there is a separate question as to what the new content model of the select element is going to be and there we should discuss any new kind of a11y mapping that goes with that and how to determine the fallback for platforms that won't enable the richer rendering, etc.

Let's not conflate them however.

@zcorpan
Member

zcorpan commented May 2, 2024

cc @whatwg/html-parser

@zcorpan
Member

zcorpan commented May 2, 2024

I agree that option 2 seems most workable and easier to understand than option 3.

I'm guessing that the content model (i.e. what is allowed for authors to use, not what does the parser do) for button can stay as-is? The content model around datalist, option and optgroup probably needs changing to allow other elements in between.

@aardrian

aardrian commented May 2, 2024

The parser changes themselves don't have a11y impact.

I think you are conflating the accessibility tree with tangible accessibility impact on users. For example, a keyboard user is not impacted by how my construct is represented in the accessibility tree -- they still need to open the select menu to get to the content. Users of Windows High Contrast Mode / Contrast Themes could be impacted based on how this content is mapped to native controls when it assigns user-chosen colors.

Authors will absolutely do things like nest interactive controls, conflicting roles, and more. So whether the changes impact the accessibility tree or the content model feels dismissive of my broader question.

So forgetting the accessibility tree, what if my construct is:

<select>
 <main>
  <h1>sandwiches</h1>
  <ul>
   <li><option>Meat</option></li>
   <li><option>Not Meat</option></li>
  </ul>
   <select>
    <button>Do</button>
    <option>This</option>
   </select>
 </main>
</select>

Is the idea that in that scenario a keyboard user has to expand the <select> to get to the other controls? How do arrow keys work in that case? What about when the user needs to scroll to see all the content? Etc.

Yeah, that's a very specific example, but this proposal is making a very broad suggestion without any equally broad statements how things like keyboard navigation would work.

Also:

My preference is option 2 because it has the lowest risk of breaking websites, but I have gotten feedback that requiring developers to write <datalist> is not great.

Is this feedback a function of lack of existing support for <datalist> (because browser makers can fix that)? Is this feedback anchored on anything specific?

@annevk
Member

annevk commented May 2, 2024

All I'm saying is that this is not the best issue to discuss that broader question. I also wasn't talking about the accessibility tree.

I'm saying that you can create such a construct today. Example: https://software.hixie.ch/utilities/js/live-dom-viewer/saved/12696. The result is a select with no options. This issue doesn't propose changing any of that. That requires bigger changes we'll indeed need to discuss in depth.
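To make the point about API-built trees concrete, here is a small sketch using Python's `xml.dom.minidom` as a stand-in for the browser DOM (an assumption for illustration; the linked demo itself runs in a browser, and its exact tree is not reproduced here). `legacy_list_of_options` is a hypothetical helper that follows the pre-change definition as I understand it: option children of the select, plus option children of its optgroup children.

```python
from xml.dom.minidom import Document

def legacy_list_of_options(select):
    """Option children of select, plus option children of its optgroup children."""
    found = []
    for child in select.childNodes:
        if child.nodeType != child.ELEMENT_NODE:
            continue
        if child.tagName == "option":
            found.append(child)
        elif child.tagName == "optgroup":
            found.extend(
                g for g in child.childNodes
                if g.nodeType == g.ELEMENT_NODE and g.tagName == "option"
            )
    return found

# Build <select><main><option>Meat</option></main></select> through API calls.
# No parser is involved, so nothing gets dropped.
doc = Document()
select = doc.createElement("select")
main = doc.createElement("main")
option = doc.createElement("option")
option.appendChild(doc.createTextNode("Meat"))
main.appendChild(option)
select.appendChild(main)

if __name__ == "__main__":
    print(select.toxml())
    # The nested option is not a child of select or of an optgroup child.
    print(len(legacy_list_of_options(select)))
```

In this model the tree exists as written, but the nested option is not counted, which is consistent with "a select with no options".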

@aardrian

aardrian commented May 2, 2024

All I'm saying is that this is not the best issue to discuss that broader question.

Fair enough. I don't see an issue for discussing the accessibility impacts I raised, though. @josepharhar have you filed that and I can't find it?

Also, still curious on the <datalist> justification you cited.

@josepharhar
Contributor Author

I don't see an issue for discussing the accessibility impacts I raised, though. @josepharhar have you filed that and I can't find it?

Looks like scott filed one here: #10317

Also, still curious on the datalist justification you cited.

I wrote that "I have gotten feedback that requiring developers to write datalist is not great" because developers said that it would be better if they could use the new parser mode without needing to also write <datalist>. It's not great to require <datalist> to opt in to new parsing behavior, but not a huge deal to have to write it.

@aardrian

aardrian commented May 2, 2024

Looks like scott filed one here: #10317

Indeed he did. And I very much appreciate that Scott did so. But I note it was not filed alongside this issue and he was the one who stepped in to file it after I asked.

I wrote that "I have gotten feedback that requiring developers to write datalist is not great" because developers said…

Again, what developers? If it was a couple friends of yours, that's fine because it's still feedback but it also doesn't carry a lot of weight. But if it was hundreds of devs as a function of a survey as part of discovery for filing this issue, that feedback carries a lot more weight IMO.

@past past removed the agenda+ To be discussed at a triage meeting label May 2, 2024
@annevk
Member

annevk commented May 3, 2024

It seems somewhat self-evident that requiring <datalist> is not great? If you could leave it out and get the same results how would that not be better?

@zcorpan
Member

zcorpan commented May 3, 2024

Existing sites might have other tags in option and expect the current behavior. The parser currently drops most tags but has special handling for <select>, <input>, <keygen>, <textarea>, and for "in select in table" also special handling of table-related tags.

Maybe we could keep those special-cases and accept other tags in option? Maybe even in select? Keeping the "break out of select" tags as-is seems especially relevant for web compat.
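The special cases described above can be summarized with a small classifier. This is a simplified sketch of how the legacy "in select" mode treats start tags; the function name and the returned strings are made up for illustration, and it leaves out `<script>`, `<template>`, and the "in select in table" variant.

```python
# Simplified sketch of the legacy "in select" handling of start tags.
INSERTED = {"option", "optgroup", "hr"}
BREAK_OUT = {"input", "keygen", "textarea"}

def legacy_in_select_action(tag):
    if tag in INSERTED:
        return "insert"
    if tag == "select":
        # A nested <select> start tag acts like </select>; the tag itself is dropped.
        return "close select, drop token"
    if tag in BREAK_OUT:
        # The open select is closed and the tag is then handled outside of it.
        return "close select, reprocess token"
    return "ignore"

if __name__ == "__main__":
    for tag in ["option", "div", "input", "select", "b"]:
        print(tag, "->", legacy_in_select_action(tag))
```

Keeping the "close select" branches and only changing the final `ignore` branch would be one way to read the suggestion of accepting other tags while preserving the break-out behavior.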

@zcorpan
Member

zcorpan commented May 3, 2024

https://ablefast.com/ has <div><select></div><option>. If we go with a variant of option 1, this should continue to work.

<div class="input-field">
<label for>Past Pools Results:</label>
<select class name="date" onChange="if (!window.__cfRLUnblockHandlers) return false; window.location.href=this.value" data-cf-modified-5aeb4253f0e4e65f25b139af->
<option disabled selected>Select Date</option>
</div>
<div class="table-wrapper table-responsive">
<option value="https://ablefast.com/results/2024-05-04">04-May-2024</option>

Also, https://html.spec.whatwg.org/#parsing-main-inselect:has-an-element-in-select-scope would need to be changed to correctly handle <select> and </select>, since it currently assumes there won't be other elements on the stack (except in the fragment case). Probably it should be more like table scope (with select instead of table).
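A sketch of why the scope definition matters, assuming the generic "has an element in the specific scope" walk from the spec: start at the current node, succeed on the target, fail on a barrier element, otherwise move up the stack. The function names are illustrative, and `table_like_barrier` is only one possible reading of "like table scope with select instead of table".

```python
def has_element_in_scope(stack, target, is_barrier):
    """stack: open element names, oldest first; the current node is last."""
    for name in reversed(stack):
        if name == target:
            return True
        if is_barrier(name):
            return False
    return False

def select_scope_barrier(name):
    # Select scope at the time of this thread: every element type is a
    # barrier except optgroup and option.
    return name not in {"optgroup", "option"}

def table_like_barrier(name):
    # Hypothetical replacement modeled on table scope (html, table, template).
    return name in {"html", "select", "template"}

if __name__ == "__main__":
    stack = ["html", "body", "select", "div", "option"]
    print(has_element_in_scope(stack, "select", select_scope_barrier))
    print(has_element_in_scope(stack, "select", table_like_barrier))
```

With a `<div>` on the stack, the existing select scope stops at the `<div>` and never reaches the `<select>`, so `</select>` would not find it. The table-like variant walks past the `<div>` and finds it.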

@josepharhar
Contributor Author

Maybe we could keep those special-cases and accept other tags in option? Maybe even in select? Keeping the "break out of select" tags as-is seems especially relevant for web compat.

Keeping some special cases sounds reasonable, but allowing other tags within option without the datalist opt-in would break top.ge (mentioned in the issue description) because it has <a> and <b> tags inside <option>s, then pulls the DOM out and renders it outside of the <select>.

@zcorpan
Member

zcorpan commented May 6, 2024

top.ge is affected by both option 1 and option 3. But it would still be usable, right? I tested inserting the a and b elements in devtools and the widget still worked.

@josepharhar
Contributor Author

top.ge is affected by both option 1 and option 3. But it would still be usable, right? I tested inserting the a and b elements in devtools and the widget still worked.

Thanks for taking a look, I really hope I'm wrong about this!

I'm really not sure exactly what the site is doing, but with my experimental patch to make the parser allow anything, this is what the site looks like:
[Screenshot, 2024-05-06: the top.ge dropdown rendered with the experimental patch]

The HTML of each item in that dropdown list before my patch looks like this:

<li data-original-index="0" class="selected">
  <a tabindex="0" class="" data-tokens="null" role="option" aria-disabled="false" aria-selected="true">
    <span class="text">  ყველა  (19576) </span>
    <span class="glyphicon glyphicon-ok check-mark"></span>
  </a>
</li>

And with the patch, it looks like this (I manually edited the whitespace/newlines to make it readable):

<li data-original-index="0" class="selected">
  <a tabindex="0" class="" data-tokens="null" role="option" aria-disabled="false" aria-selected="true">
    <span class="text"> </span>
  </a>
  <a tabindex="0" aria-disabled="false" aria-selected="true"> ყველა </a>
  (<b>19576</b>)
  <span class="glyphicon glyphicon-ok check-mark"></span>
</li>

@zcorpan
Member

zcorpan commented May 6, 2024

OK, yeah that looks pretty broken, though still functional.

Regarding "break out of select" tags, is it a requirement to be able to use input in select? Is it a requirement to be able to use table in select in table (maybe a flight seat picker or so)? I don't see such examples in https://open-ui.org/components/selectlist/#use-cases or https://microsoftedge.github.io/Demos/selectlist/index.html

@LeaVerou

LeaVerou commented May 8, 2024

Writing this with my TAG hat on; this is the same feedback I would give during a design review.

Requiring an opt-in so that HTML content can work in the predictable, obvious way is author-hostile. Elements being dropped within <select> and <option> has historically been the source of a lot of author pain, even outside styling use cases (e.g. in scripting). Fixing it by introducing more syntax means the default behavior continues to be unintuitive, while authors need to learn yet another trick so their HTML works in the expected way. Opting to hold up such a confusing behavior to avoid changing how unclosed <select>s render in a tiny fraction of obscure websites is …the wrong weighing of priorities IMO.

Not to mention it makes it impossible to use this feature via progressive enhancement, since it completely breaks <select> in browsers that don’t support this. This is not necessarily a showstopper by itself, but it will slow down adoption considerably which also means that by the time we have enough author feedback about using it on real-world scenarios, it will be too late to change it in any way.

It might have been a reasonable weighing of tradeoffs if the use counter showed a much higher percentage, but 0.3% is actually encouraging as an upper bound for breakage. I suspect for many, if not most, cases within that 0.3%, the change will actually fix content that is currently broken. E.g. in the example given of an unclosed <select> that is eating up the rest of the page, I’d argue the change actually makes the content more accessible. And as it was pointed out, in many other cases, the content is still functional, even if it looks broken.


Looking at the compat analysis document (thanks @josepharhar for such a detailed report!), there are 19 websites in there of which 13 are actually unaffected by the change (green). Of the 6 that are marked yellow or red:

  • 3 are hidden with display: none, so not sure why they are marked yellow. If they are hidden, then the change does not negatively affect them and thus they should be green. What am I missing?
  • And in the remaining 3, it appears the select is still functional, just would not look pretty, which is probably appropriate since it actually does have this HTML content inside it.

Looking at chromestatus it appears there are 117 websites total that do this. How were the 19 in the doc selected? Is it the 19 most popular ones? The 19 first ones?

@josepharhar
Contributor Author

Is it a requirement to be able to use table in select in table (maybe a flight seat picker or so)?

The only reason I can think of to use a table in a select would be for laying out the options in a table.

Requiring an opt-in so that HTML content can work in the predictable, obvious way is author-hostile

I agree. I am becoming increasingly convinced that we should do option 1 and deal with the compat fallout.

3 are hidden with display: none, so not sure why they are marked yellow. If they are hidden, then the change does not negatively affect them and thus they should be green. What am I missing?

I agree, you're not missing anything. I was just taking some notes without thinking that I'd publish them and decided to use yellow for these websites since I couldn't figure out if my changes would make a difference or not.

And in the remaining 3, it appears the select is still functional, just would not look pretty, which is probably appropriate since it actually does have this HTML content inside it.

Yeah the only one which seemed more impacted was http://tx.7ma.cn/ since there is a checkbox and submit button which no longer get rendered.

Looking at chromestatus it appears there are 117 websites total that do this. How were the 19 in the doc selected? Is it the 19 most popular ones? The 19 first ones?

If I remember correctly it was just the first 19. I also skimmed the list for websites I could recognize, of which there were not many.

@lukewarlow
Member

is it a requirement to be able to use input in select?

I can see this being useful for building searchable selects. Like a bounded version of input + datalist?

@LeaVerou

LeaVerou commented May 9, 2024

is it a requirement to be able to use input in select?

I can see this being useful for building searchable selects. Like a bounded version of input + datalist?

That would require far more extensive changes than the parser changes described here to become possible. Also, it's such a common need that it should be possible without having to roll your own UX (in fact, I'd argue browsers should implement it by default for selects with more than say 20 options).

chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Jul 17, 2024
This patch makes <select> allow tags besides <option>, <optgroup>, and
<hr>. Previously this was only allowed within a child <button> or
<datalist> tag inside <select>, but based on the feedback in whatwg we
should try to allow this content everywhere:
whatwg/html#10310

This behavior is guarded behind a flag. Since I am planning on shipping
parser changes for <select> before appearance:base-select, I am creating
a new flag for parser changes instead of reusing the existing
StylableSelect flag for appearance:base-select. The new flag is intended
to not only make the parser change, but also update the algorithms which
associate option/optgroup/hr elements with select elements to account
for the newly parsed elements.

If everything goes well, then we will need to change these WPTs which
this patch effectively marks as failing:
html/infrastructure/common-dom-interfaces/collections/htmloptionscollection.html
html/semantics/forms/the-select-element/select-value.html
html/syntax/parsing/

Bug: 1511354
Change-Id: I441f9645a592ac63764fef928e4e5acf3fdec5db
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5518837
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Reviewed-by: David Baron <dbaron@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1329137}
sadym-chromium pushed a commit to web-platform-tests/wpt that referenced this issue Jul 18, 2024
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Jul 19, 2024
Automatic update from web-platform-tests
Relax <select> parser rules

wpt-commits: 8e71820a49e67752c5f9ef6fad7a52753b004b5a
wpt-pr: 47179
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Jul 19, 2024
marcoscaceres pushed a commit to web-platform-tests/wpt that referenced this issue Jul 22, 2024
josepharhar added a commit to josepharhar/open-ui that referenced this issue Jul 29, 2024
Based on discussion and developer feedback in the WHATWG issue about
HTML parser changes for <select>, we are not requiring <datalist> to be
added to the author HTML for most cases. This PR updates the explainer
to remove <datalist> in a bunch of places.

whatwg/html#10310
josepharhar added a commit to openui/open-ui that referenced this issue Aug 6, 2024
* Update HTML parser changes in select explainer

* say slotted instead of put in
@josepharhar josepharhar linked a pull request Aug 13, 2024 that will close this issue
5 tasks
@josepharhar
Contributor Author

I created a spec PR here, review would be appreciated: #10557

josepharhar added a commit to josepharhar/html that referenced this issue Aug 19, 2024
This patch makes the parser allow additional tags in <select> besides
<option>, <optgroup>, and <hr>, mostly by removing the "in select" and
"in select in table" parser modes.

In order to replicate the behavior where opening a <select> tag within
another open <select> tag inserts a </select> close tag, a traversal
through the stack of open elements was added which I borrowed from the
<button> part of the parser.

This will need test changes to be implemented in html5lib.

Fixes whatwg#10310
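
The stack traversal mentioned in this commit message can be pictured roughly as follows. This is a sketch under assumptions: `close_open_select` is a made-up name, it checks for any open `<select>` rather than doing a proper "in scope" check, and it stops after the pop because the message does not say whether the new `<select>` is then opened.

```python
def close_open_select(stack):
    """Pop elements up to and including the nearest open select.

    stack: open element names, oldest first. Returns True if a select
    was closed, False if none was open (the stack is left untouched).
    """
    if "select" not in stack:
        return False
    while stack.pop() != "select":
        pass
    return True

if __name__ == "__main__":
    stack = ["html", "body", "select", "div", "option"]
    print(close_open_select(stack), stack)
```

The same "pop until the element has been popped" pattern is what the parser already does when a `<button>` start tag arrives while another button is open.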
josepharhar added a commit to josepharhar/html that referenced this issue Aug 21, 2024
@zcorpan
Member

zcorpan commented Aug 22, 2024

Analysis of the mXSS risk with this change:

Current parsers will ignore most tags in select, and the new parser will not. This allows for getting different parser state between new and legacy parsers, which can be used for mXSS. For example, see https://software.hixie.ch/utilities/js/live-dom-viewer/saved/13018 (targeting legacy browsers where the sanitizer uses the new parsing, but the other way around is probably also possible).

However, as far as I can tell an mXSS attack still needs something else like a RAWTEXT element to be unfiltered, which is more general and e.g. DOMPurify checks for (demo showing DOMPurify mitigating this attack).

Conclusion: sanitizers need to have general protection against mXSS, and so this change doesn't introduce new mXSS vectors, even though the parsed tree can be very different between new and legacy parsers.

josepharhar added a commit to josepharhar/html that referenced this issue Aug 28, 2024
josepharhar added a commit to josepharhar/html that referenced this issue Sep 5, 2024
josepharhar added a commit to josepharhar/html that referenced this issue Sep 10, 2024
josepharhar added a commit to josepharhar/html that referenced this issue Sep 10, 2024
josepharhar changed the title from "HTML parser changes for stylable <select>" to "HTML parser changes for customizable <select>" Oct 3, 2024
aarongable pushed a commit to chromium/chromium that referenced this issue Oct 16, 2024
This flag is intended to de-risk the launch of SelectParserRelaxation by
partially reverting the new parser behavior to the old parser behavior
specifically in the case of an <input> tag being parsed inside a
<select>. The old parser would convert <select><input> into
<select></select><input>, and based on my research, this is the case
that is most likely going to break sites in SelectParserRelaxation:
whatwg/html#10310

Bug: 373672164
Change-Id: I33b40d11c2001092aa076a219dd56c5ea86f13f6
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5936092
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1369676}
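
The effect of this flag on `<select><input>` can be sketched in Python as follows. This is a simplified model, not Chromium's implementation: `on_input_start_tag` and its `input_closes_select` parameter are hypothetical names that mirror the flag described above, and the stack of open elements is a plain list of tag names.

```python
def on_input_start_tag(open_elements, input_closes_select=True):
    """Return the tag name of the element that will become the <input>'s parent.

    With input_closes_select=True, an open <select> is closed first, so
    <select><input> behaves like <select></select><input> (old behavior).
    """
    if input_closes_select and "select" in open_elements:
        # Find the most recently opened <select> and pop through it.
        idx = len(open_elements) - 1 - open_elements[::-1].index("select")
        del open_elements[idx:]
    # <input> is a void element: it is inserted under the current node
    # and is never left on the stack of open elements.
    return open_elements[-1]


old_stack = ["html", "body", "select"]
old_parent = on_input_start_tag(old_stack, input_closes_select=True)

new_stack = ["html", "body", "select"]
new_parent = on_input_start_tag(new_stack, input_closes_select=False)
```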
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 16, 2024
chromium-wpt-export-bot pushed a commit to web-platform-tests/wpt that referenced this issue Oct 16, 2024
aarongable pushed a commit to chromium/chromium that referenced this issue Oct 19, 2024
(cherry picked from commit 9776ce6)

Bug: 373672164
Change-Id: I33b40d11c2001092aa076a219dd56c5ea86f13f6
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5936092
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Cr-Original-Commit-Position: refs/heads/main@{#1369676}
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5941877
Auto-Submit: Joey Arhar <jarhar@chromium.org>
Commit-Queue: Mason Freed <masonf@chromium.org>
Cr-Commit-Position: refs/branch-heads/6778@{#238}
Cr-Branched-From: b21671c-refs/heads/main@{#1368529}
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this issue Oct 21, 2024
Automatic update from web-platform-tests
Add InputClosesSelect flag

This flag is intended to de-risk the launch of SelectParserRelaxation by
partially reverting the new parser behavior to the old parser behavior
specifically in the case of an <input> tag being parsed inside a
<select>. The old parser would convert <select><input> into
<select></select><input>, and based on my research, this is the case
that is most likely going to break sites in SelectParserRelaxation:
whatwg/html#10310

Bug: 373672164
Change-Id: I33b40d11c2001092aa076a219dd56c5ea86f13f6
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/5936092
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Joey Arhar <jarhar@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1369676}

--

wpt-commits: 00e1df7e329f3d11b91d7b2e11a2db63bbd98ef9
wpt-pr: 48658
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this issue Oct 22, 2024
UltraBlame original commit: fdce5c8cbadca9d2447fe36c71dcd93160946cda
gecko-dev-updater pushed a commit to marco-c/gecko-dev-comments-removed that referenced this issue Oct 22, 2024
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this issue Oct 22, 2024
ErichDonGubler pushed a commit to erichdongubler-mozilla/firefox that referenced this issue Oct 23, 2024
annevk added the "topic: select" label Nov 8, 2024