New attribute to control UA-provided writing assistance #9065

Closed
bmathwig opened this issue Mar 22, 2023 · 47 comments
Labels
addition/proposal New features or enhancements agenda+ To be discussed at a triage meeting needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan topic: forms

Comments

@bmathwig

bmathwig commented Mar 22, 2023

The current specification allows the autocomplete attribute on <input>, <textarea>, and <select> elements. With the rise in popularity of rich text controls using contenteditable, we should consider allowing elements that have contenteditable=true to use the autocomplete attribute. While not a common scenario within the scope of form fields, there are applications for text hinting and autofill within contenteditable elements.

One existing example of a form field being replaced by contenteditable exists in the example section of its specification.

@annevk annevk added addition/proposal New features or enhancements needs implementer interest Moving the issue forward requires implementers to express interest topic: forms labels Mar 23, 2023
@annevk
Member

annevk commented Mar 23, 2023

cc @whatwg/forms

@bmathwig
Author

Microsoft is interested in implementing this in Edge and Chromium :)

@zcorpan
Member

zcorpan commented Mar 27, 2023

cc @DimiDL @galich @masayuki-nakano

@zcorpan
Member

zcorpan commented Mar 28, 2023

Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control

Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

@annevk annevk removed the needs implementer interest Moving the issue forward requires implementers to express interest label Mar 28, 2023
@annevk
Member

annevk commented Mar 28, 2023

WebKit is also interested in this.

(For maximum clarity, Chromium and Edge count as a single implementer for WHATWG purposes.)

@annevk annevk added the needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan label Mar 28, 2023
@bmathwig
Author

> Autocomplete works differently on different form controls, see https://html.spec.whatwg.org/#inappropriate-for-the-control
>
> Editing hosts don't have a way to signal what kind of input is accepted (e.g. single line vs multiline). Which "groups" should editing hosts be part of? All of them?

I think contenteditable will always be a Text-Multiline host. I can't think of any cases where the other groups would apply. We also may want to expand the Field Name to include additional types of content in the future.

@bmathwig
Author

Here is our proposal to adjust the wording of 4.10.18.7.1 Autofill to allow for editing host elements to be eligible for autofill and the autocomplete attribute.

https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

@sanketj
Member

sanketj commented Jul 20, 2023

@annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

@mfreed7
Collaborator

mfreed7 commented Aug 3, 2023

> @annevk, @domenic, @mfreed7, @zcorpan: Curious to hear your thoughts on the proposal Ben shared above.

Seems reasonable, but I'm not an autocomplete expert. I do worry about the leakage of sensitive information if autocomplete can more easily be tricked into filling a general <div> with PII. But as mentioned, that risk already exists with <input>, so I'm not sure why this would be worse. @battre would have better input from Chrome's side.

@annevk
Member

annevk commented Aug 25, 2023

What are the considerations around events? contenteditable involves quite a few more events. Do any need to be simulated when autofilling? The solution needs to address that somehow.

cc @johanneswilm

@johanneswilm

@bmathwig What kind of input type are you thinking of using for the corresponding beforeinput and input events? And how does this fit with Microsoft's plans to replace a lot of contenteditable usage with EditContext, which Microsoft is also working on?

In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

@johanneswilm

@bmathwig Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

@battre

battre commented Aug 25, 2023

tl;dr: I have a couple of concerns about this proposal, which basically boil down to the point that the proposal endorses the use of <div contenteditable> for something that's semantically a form control but does not feel and behave like a form control anymore. I would prefer if websites used form controls for forms and <div contenteditable> for editable content. Otherwise, I think that autofill may work worse than today.

Here are the details.

We observe that

  • the majority of <input> elements don't have autocomplete attributes,
  • the current autocomplete spec is not expressive enough for addresses in most countries.

Chrome compensates for these problems as well as possible by running heuristics in the browser and crowdsourcing. My concern is that if <div contenteditable> becomes the new <input> (either because libraries use it or because it's the new best practice you find on stackoverflow), we may lose the capability to classify fields.

  1. Semantic grouping: Today most form controls that belong together are semantically grouped via <form> tags. I expect that this would become less the case if people don't think in terms of forms but in terms of <div contenteditable>s because they cannot be associated with a <form> tag and look and feel like layout components, not like form components.
    With the loss of <form> tags our client-side heuristics would struggle to find the boundaries between semantically unrelated forms (search box, login form, sign-up form, shipping address form, chat box, ...) which can co-exist on a website. (Not a new problem but one that will become more pronounced).

  2. Loss of signals for heuristics: Developer documentation for <input> elements suggests assigning name attributes to fields, and we see that developers do this a lot (even though they may submit the data via Fetch - after all the Internet is made of copy&paste from tutorials ;-)). This gives us semantic hints about the meaning of fields.
    With the loss of <form> tags and name attributes, Chrome would lose the capability to do meaningful crowdsourcing of field semantics (a "form" becomes harder to reference) and the heuristics would lose an important signal that helps assign meaning to a field. (Not a new problem but one that will become more pronounced).

  3. Form submission detection is hard if we don't have a <form> that's POSTed via a submit(). We have built many complex heuristics as proxies for candidates for form submission events, such as checking whether a <form> is taken out of the DOM or made invisible. This, again, would become more brittle if we didn't have <form>s.
    With the loss of <form> tags it would become increasingly difficult to see submissions, which we use to ask the user whether they want to save their password, credit card, address, ...

In summary, I believe that we are better off if fields that are semantically parts of a form remain form controls.

If <textarea> is not styleable enough, could we introduce <textarea richcontent> or something like it that remains a form control and is associated with a <form> but can have DOM children like a <div contenteditable>? That might be nice from the perspective of posting a form with Fetch and would be in line with <selectlist>s, which make <form>s more powerful rather than pushing users to custom solutions built from <div>s.

All that said, @johanneswilm raises a lot of good questions that are also unclear to me and would pertain to such a <textarea richcontent>.

@sanketj
Member

sanketj commented Aug 29, 2023

Thanks @battre! The intent of this proposal is not to make contenteditable elements targets for form fill (although it technically already can be today - see below). Rather, it is to extend the scope of the autocomplete attribute beyond just form autofill scenarios. For editable regions, the use cases for autocomplete are mainly for writing assistance to allow the user to write faster, not necessarily for filling forms.

There are a couple of subtleties in terms of interactions with form elements:

  • For text control elements (ex. textarea), UAs may use autocomplete for both writing assistance and form fill use cases.
  • It is technically possible for a contenteditable to be form associated if it is part of a form-associated custom element (https://html.spec.whatwg.org/multipage/forms.html#categories). For such cases, similar to other text control elements, UAs may use autocomplete on contenteditables to signal both writing assistance and form fill.

An alternative solution could be to create a new attribute to support "autocomplete for writing assistance" scenarios. However, since these are also autocompletion scenarios, it would be ideal to just reuse the existing autocomplete attribute.

@sanketj
Member

sanketj commented Aug 29, 2023

> What kind of input type are you thinking of using for the corresponding beforeinput and input events?

Autocompletion on input elements fires these events today. I would expect them to fire similarly for contenteditables.

> In the examples mentioned in your proposal, where would the suggestions of autocompletion for the code editor come from? Is this one of the existing autocomplete types like address, etc.? Would it work in the middle of an element with other content preceding and following it or will it replace the entire content of the contenteditable element?

For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

> Also, will the auto-complete text that is stored in the browser contain richtext itself? So if the user fills in their address in one place and uses <b> around the last name, will there be a <b>-tag around the last name when reinserting it somewhere else? If yes, how will that work if inserting into a different website where the editor uses <strong> instead of <b>? And how about code editors where styling is used differently from editor to editor?

Yes, preserving rich text does seem quite tricky to get right, and it's unclear how useful it would be. Do you have scenarios in mind where this might be desirable? Storing and inserting the autocomplete text as plain text seems sufficient.

@johanneswilm

> For existing autocomplete scenarios, the spec doesn't prescribe where the autocompletion comes from. Neither does it specify whether the autofilled content should replace existing content or just be inserted. This is left up to the UA. For writing assistance scenarios, it also seems reasonable to leave this up to the UA.

In that case, I would think it's a bad idea to try to add this to contenteditable. Contenteditable elements are generally controlled by thousands of lines of JavaScript that try to ensure that similar markup is produced on all platforms and browsers. Firefox was the last browser to remove some major elements that worked differently from the other browsers (table controls). Introducing new issues that work differently doesn't seem like a good idea.

That's different for input[type=text] and textarea elements as they produce simple text. Even if the same text is first edited in one UA and then another, there is generally no problem if the UAs behave somewhat differently (with the exception of line endings in some scenarios, but those can be fixed with a single line of JavaScript).
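As a rough illustration of the line-ending point, such a one-line fix might look like the sketch below. The helper name `normalizeLineEndings` is made up here, and the sketch assumes the goal is to convert CRLF and lone CR sequences to LF.

```javascript
// Hypothetical helper: convert CRLF and lone CR line endings to LF.
const normalizeLineEndings = (text) => text.replace(/\r\n?/g, "\n");
```

Plain-text values from input and textarea elements can be passed through a helper like this before they are compared or stored.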

Because JS editors are such large programs, they also have plugins that provide auto-completion [1] in ways highly specific to a given type of content. It seems like it would be difficult to create a one-size-fits-all, UA-specific model to replace all of these.

[1] For example https://ckeditor.com/cke4/addon/autocomplete or https://github.com/curvenote/editor/tree/main/packages/prosemirror-autocomplete

@sanketj
Member

sanketj commented Aug 29, 2023

> What are the considerations around events? contenteditable involves quite a few more events. Do any need to be simulated when autofilling? The solution needs to address that somehow.

What categories of events are you referring to? It seems like eventing for autocomplete should work similarly to the user just replacing/inserting that content via manual input. Events for input, composition, etc. are already fired on text control elements like this, and I would expect those events to work the same way for contenteditables.

@sanketj
Member

sanketj commented Aug 29, 2023

I would expect autocomplete to only support plain text content; perhaps that can be made explicit in the spec. Thus, this would work similarly to the user manually replacing/inserting that same content via text input methods (ex. typing, composition), which wouldn't be site breaking.

@johanneswilm

johanneswilm commented Aug 29, 2023

> Yes, preserving rich text does seem quite tricky to get right and unclear how useful it would be. Do you have scenarios in mind where this might be desirable?

Looking at the kind of autocomplete existing richtext editors based on contenteditable do, they would for example put tags around a specific term that was inserted through auto-completion to give it a different color or style. The code editor mentioned in your explainer [1] would likely need to do that if it is supposed to work like other web-based code editors.

> I would expect autocomplete to only support plain text content, perhaps that can be made explicit in the spec.

Ok that would remove one potential issue.

But if I understand you correctly, it would be up to the UA whether to replace the entire contents or just add something new, right? So the code editor in your example could work in Safari on Mac in a way where it would just suggest an entire code snippet to the user and then replace everything else in there, whereas in Edge on Windows it may give suggestions for specific terms to be used within the code editor? If that is the case, who would opt for using this feature that only sometimes works as required for some users rather than use one of the existing JavaScript code editors with an existing auto-complete plugin that works all the time and everywhere?

I'm thinking maybe the usecase for this is something else, such as an address input on a simple form field where the user wants to use a contenteditable element instead of a textarea for some reason - maybe because there are situations where there could be richtext in the address? That then carries with it the issues mentioned by @battre above.

[1] https://github.com/MicrosoftEdge/MSEdgeExplainers/blob/main/AutocompleteContentEditable/explainer.md

@sanketj
Member

sanketj commented Aug 31, 2023

The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill. I've updated the use cases section in the explainer, hopefully that helps. For the use cases I can imagine, they all seem to be about inserting content where the user is typing (replacing the user's selected text if necessary). My reasoning for leaving the decision about how to insert content into the DOM up to the UA is that autocomplete is about browser-powered functionality and it is unclear what browsers might come up with in the future. The intent is that regardless of what types of writing assistance UAs add, authors should be able to control it with the autocomplete attribute. The existing spec also does not prescribe how exactly autofill should insert content into the DOM, it just mentions that UAs must act "as if the user had modified the control's data". This allows a wide range of use cases to be supported.
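To make "inserting content where the user is typing (replacing the user's selected text if necessary)" concrete, here is a minimal plain-text sketch. It models the editable region as a string plus a selection range. The function name `insertSuggestion` and the returned shape are hypothetical, and how a real UA mutates the DOM is deliberately left open, as described above.

```javascript
// Hypothetical model of a plain-text editing surface: a string plus a selection range.
// Replaces the selected range (or inserts at the caret when the range is collapsed)
// and returns the new text together with the new caret position.
function insertSuggestion(text, selectionStart, selectionEnd, suggestion) {
  const start = Math.min(selectionStart, selectionEnd);
  const end = Math.max(selectionStart, selectionEnd);
  const newText = text.slice(0, start) + suggestion + text.slice(end);
  return { text: newText, caret: start + suggestion.length };
}

// Collapsed selection: the caret sits at the end of "Thanks for ".
const inserted = insertSuggestion("Thanks for ", 11, 11, "your help");
// Non-collapsed selection: "teh" (indices 4 to 7) is replaced.
const replaced = insertSuggestion("Fix teh typo", 4, 7, "the");
```

Because only plain text goes in, the result matches what the user would have produced by typing over the same selection.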

@johanneswilm

johanneswilm commented Aug 31, 2023

> The scenarios for autocomplete on contenteditable are primarily about writing assistance, not form fill.

Ok, that makes more sense. So it looks like you are planning for a scenario where a UA or an operating system is providing something like text completion using a large language model (LLM), either by directly completing the text or by figuring out that this would be an appropriate place to insert the user's phone number, credit card number or similar and then offering that as something to easily fill in.

I can see how it would make sense to signal to either the UA or browser extensions (like Grammarly and new similarly AI-based offerings) that this is a field where such assistance would be desired or where it should not be offered. This kind of assistance is qualitatively different from spell-checking, so you need to have two distinct keywords.

I wonder then, given that this usage is quite different from the form-filling help that the autocomplete attribute offers, whether it would not make sense to use a different term to make it less confusing. Maybe something like textcompletion?

I also think it should be made clear which input type (for the beforeinput and input events) will be used for this. There is one called insertReplacementText, and its usage is described as "replace existing text by means of a spell checker, auto-correct or similar". I can see from the name that it was initially meant to be used for spell and grammar checkers, but it would still seem like the most appropriate one. Otherwise, maybe we need to add another type to the chart.
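For illustration only, here is a sketch of how an editor might tell such an insertion apart from ordinary typing if UAs did report it as insertReplacementText. That choice is an assumption, since the thread has not settled on an input type. The function name `describeInput` is hypothetical. The sketch also assumes, based on the Input Events drafts, that text control elements carry the text in `data` while contenteditable hosts carry it in `dataTransfer`.

```javascript
// Hypothetical classifier for beforeinput/input events. It reads only the
// `inputType`, `data`, and `dataTransfer` fields, so plain objects can stand in
// for real InputEvent instances.
function describeInput(event) {
  // Assumption: text controls expose the text in `data`, editing hosts in `dataTransfer`.
  const text = event.data ?? event.dataTransfer?.getData("text/plain") ?? "";
  switch (event.inputType) {
    case "insertReplacementText":
      return { source: "replacement", text };
    case "insertText":
      return { source: "typing", text };
    default:
      return { source: "other", text: "" };
  }
}

const typed = describeInput({ inputType: "insertText", data: "a" });
const suggested = describeInput({
  inputType: "insertReplacementText",
  data: null,
  // Minimal stand-in for a DataTransfer object.
  dataTransfer: { getData: (type) => (type === "text/plain" ? "address" : "") },
});
```

In a page, the same function could be called from a beforeinput listener on the editing host, which is where an editor library would decide whether to accept or cancel the change.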

Under these circumstances, would it not also make sense to add this to EditContext in parallel? I have seen your notes on that, but the use cases you list, like a Facebook editor, already use a sophisticated and highly complex contenteditable-based editor that will also possibly be replaced by EditContext once shipped.

@johanneswilm

From the explainer:

Many sophisticated editors that could benefit from the EditContext API also integrate their own writing assistance features and thus may opt out of browser-powered autocompletion (ex. Google Docs, Word Online). Therefore, it is unclear whether supporting the autocomplete attribute on EditContext editable hosts will be useful.

Most production level JS richtext editors on the web are quite sophisticated and will consist of thousands of lines of code and have 5-20 years of development behind them. However, a lot of these can be run completely in JavaScript (open source libraries such as CKEditor, ProseMirror, TinyMCE, etc.). And most of the more robust ones already do what EditContext promises in that they diff the DOM after browser-initiated DOM changes and then potentially roll back some of those. Switching to EditContext will in many such cases mean a simplification of the code as one can skip diffing and rolling DOM changes back. So if and when EditContext actually ships everywhere, I would think that a lot of these libraries will eventually switch to it.

However, hosting an LLM on a server is a bit more complicated than serving a JS-based editor on a website. I would therefore think it makes a lot of sense to add both spell checking and this new feature to EditContext as well. That would also be consistent with other decisions you made, such as adding the use of the native selection as an option to EditContext even though Google Docs and other larger online word processors don't make use of it, precisely because it is also meant to be useful for smaller sites.

@past past removed the agenda+ To be discussed at a triage meeting label Dec 15, 2023
@sanketj
Member

sanketj commented Dec 15, 2023

Per resolution in #9966, Microsoft can draft up a spec PR for the new attribute. Unless there are strong objections, I plan to start with writingsuggestions as the name and on/off as values. Happy to continue discussions on the final name.

@sanketj
Member

sanketj commented Jan 11, 2024

Thanks @mfreed7, @marcoscaceres, @domenic for the feedback on #10018. Calling out a few points about the new attribute that came out of that, which might be worth discussing in more detail during the next WHATNOT meeting.

  • true (enabled) by default on all elements
  • inheritance works across shadow boundaries
  • string type for getter/setter

@sanketj
Member

sanketj commented Jan 11, 2024

Per discussion on the above points during today's WHATNOT call, the only suggested change to #10018 was to align the inheritance behavior with spellcheck, and address inheritance across shadow boundaries for both attributes (and possibly others) separately. I've updated that PR accordingly. Please let me know if there's additional feedback on that one.
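A rough sketch of the behavior discussed here (enabled by default, a string-typed getter, and spellcheck-style inheritance from ancestors) could look like the following. It uses plain objects instead of DOM elements. The function name `effectiveWritingSuggestions`, the node shape, and the "true"/"false" keywords are assumptions, since the thread mentions on/off as starting values and leaves the final names open. Shadow boundaries are not modeled.

```javascript
// Hypothetical element model: { writingsuggestions: string | null, parent: node | null }.
// Assumptions: "true" or "" enables, "false" disables, and a missing or
// unrecognized value defers to the parent. With no ancestor opting out,
// suggestions are enabled.
function effectiveWritingSuggestions(node) {
  for (let current = node; current !== null; current = current.parent) {
    const value = current.writingsuggestions;
    if (value === null) continue; // attribute absent: look at the parent
    const keyword = value.toLowerCase();
    if (keyword === "false") return "false"; // explicit opt-out
    if (keyword === "true" || keyword === "") return "true"; // explicit opt-in
    // Unrecognized value: treated like a missing attribute in this sketch.
  }
  return "true"; // enabled by default
}

const form = { writingsuggestions: "false", parent: null };
const field = { writingsuggestions: null, parent: form };
const override = { writingsuggestions: "true", parent: form };
```

Here `field` inherits the opt-out from `form`, while `override` re-enables suggestions for itself, which mirrors how spellcheck lets a descendant override an ancestor.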

@zcorpan
Member

zcorpan commented Feb 2, 2024

This should define interaction with the field-sizing CSS property, if writing suggestions can appear inline and contain potentially privacy-sensitive information. See #9903 (comment)

@annevk
Member

annevk commented Feb 9, 2024

@sanketj Why does it work for type=email but not type=telephone?

@dandclark
Contributor

> @sanketj Why does it work for type=email but not type=telephone?

I guess the principle of the current list is "types that expect character input that's not numbers-only, excepting password".

@dandclark
Contributor

> This should define interaction with the field-sizing CSS property, if writing suggestions can appear inline and contain potentially privacy-sensitive information. See #9903 (comment)

@zcorpan

The scope of this new attribute just grants authors the ability to turn off UAs' writing suggestions capabilities. An attempt hasn't been made to standardize the details of those capabilities, and they could vary widely. So while I agree that the interaction of field-sizing with UA writing suggestions and autofill needs to be worked out, it's not really in the purview of the writingsuggestions toggle attribute defined here.

Maybe worth a new issue?

@annevk
Member

annevk commented Feb 10, 2024

@dandclark it seems weird to me to include email but not telephone. If this feature can suggest email addresses, surely it can suggest telephone numbers as well.

@dandclark
Contributor

@annevk
@sanketj pointed out to me that the list of supported elements was originally taken from the element types listed for the spellcheck attribute under "User agents must only consider the following pieces of text as checkable for the purposes of this feature".

I think that list is a reasonable starting point, and given the current design in #10018 it could be expanded without breaking backwards compatibility, but I don't have any particular objection to adding "telephone".
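The point about expanding the list later can be sketched as a simple allow-list check. The starting set below (text, search, url, email) is an assumption based on the input types the spellcheck section treats as checkable; it is not copied from #10018. The names `SUGGESTION_INPUT_TYPES` and `acceptsWritingSuggestions` are hypothetical.

```javascript
// Assumed starting list, mirroring the spellcheck-checkable input types.
const SUGGESTION_INPUT_TYPES = new Set(["text", "search", "url", "email"]);

// Hypothetical check. `extraTypes` models a later, backwards-compatible
// expansion of the list (for example, adding "tel").
function acceptsWritingSuggestions(inputType, extraTypes = []) {
  const type = inputType.toLowerCase();
  return SUGGESTION_INPUT_TYPES.has(type) || extraTypes.includes(type);
}
```

Adding a type only widens where suggestions may appear, so pages that already opted out with the attribute would be unaffected.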

@sanketj sanketj added the agenda+ To be discussed at a triage meeting label Feb 22, 2024

10 participants