This repository has been archived by the owner. It is now read-only.

Matching input to datalist values #1011

Closed
r12a opened this issue Sep 8, 2017 · 3 comments

@r12a r12a commented Sep 8, 2017

4.10.8. The datalist element
https://www.w3.org/TR/2017/WD-html52-20170406/sec-forms.html#the-datalist-element

The datalist element matches user input to options in a list. In current implementations in Chrome and Firefox, as you type into a text input field the browser suggests alternatives based on the text you type. This is extremely useful for working with long lists (such as country selection), or for helping the user find list items when they don't know the beginning of the option text.

However, the string matching involved can be taken to various levels of complexity when dealing with languages besides English, especially because this is matching of natural language data, rather than identifiers.

Current implementations will match 'This' to 'this', which seems reasonable. However, the i18n WG feels that browsers should as a minimum also normalise the text being compared prior to matching. We feel that this normalisation should not be compatibility normalisation (ie. it should be NFC or NFD, but not NFKC or NFKD).
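To illustrate the distinction between canonical and compatibility normalisation, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `nfc_equal` is made up for illustration; it is not taken from any browser implementation.

```python
import unicodedata

def nfc_equal(a: str, b: str) -> bool:
    """Compare two strings after canonical normalisation (NFC)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

precomposed = "caf\u00e9"   # é as the single code point U+00E9
decomposed = "cafe\u0301"   # e followed by U+0301 COMBINING ACUTE ACCENT

# Canonically equivalent spellings compare equal once both are in NFC.
canonical_match = nfc_equal(precomposed, decomposed)

# Full-width 'Ａ' (U+FF21) is only a compatibility equivalent of 'A':
# NFC keeps the two distinct, whereas NFKC would merge them.
fullwidth_nfc_match = nfc_equal("\uff21", "A")
fullwidth_nfkc_match = unicodedata.normalize("NFKC", "\uff21") == "A"
```

With NFC (or NFD) the two spellings of "café" match while the full-width letter stays distinct, which is the minimum behaviour asked for here.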

We also believe that case-insensitive matching should take into account local tailoring, eg. so that Turkish i matches İ but not I.
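A rough sketch of what such tailoring could look like follows. Python's `str.casefold()` is locale-independent, so the Turkish/Azerbaijani branch below is hand-written and purely illustrative; the function name `fold_case` and its `locale` parameter are assumptions, not an existing API.

```python
import unicodedata

def fold_case(text: str, locale: str = "") -> str:
    """Case-fold text for matching, with a hand-written Turkish tailoring.

    Assumption: `locale` is a BCP 47-style tag such as "tr" or "tr-TR".
    """
    text = unicodedata.normalize("NFC", text)
    if locale.split("-")[0] in ("tr", "az"):
        # Turkish case pairs: dotted İ (U+0130) <-> i, dotless I <-> ı (U+0131).
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    # Default (untailored) Unicode case folding for everything else.
    return text.casefold()

# In a Turkish context, i matches İ but not I.
tr_i_matches_dotted = fold_case("i", "tr") == fold_case("\u0130", "tr")
tr_i_matches_plain = fold_case("i", "tr") == fold_case("I", "tr")

# Without tailoring, i and I match as usual.
default_i_matches_plain = fold_case("i") == fold_case("I")
```

A real implementation would more likely rely on a library that ships the Unicode special-casing data rather than two hard-coded replacements.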

Beyond that, however, there are many types of comparison that could be made, including things such as accent-stripping, matching kana with kanji, matching full- and half-width characters, etc.

The i18n WG proposes that the HTML spec:

  1. indicate that locale-tailored string matching and normalization SHOULD be done during comparison of input strings and datalist options, and
  2. include a note to alert browser implementers to the fact that they MAY want to consider additional methods of string matching for international users.

@aphillips aphillips commented Sep 10, 2017

The problem here is the larger problem of "string searching" (for which we have an FPWD that we're not currently working on). This type of natural language text matching is somewhat more complex than substring or namespace name matching--including (and not limited to) case insensitivity issues. Browsers do have implementations of this kind of searching--which is not specified in HTML: it's the "find" feature.

Note that Richard wrote "matching kana with kanji", to which I would add matching katakana and hiragana (and the wide/narrow katakana equivalents should match). And to accent-stripping (the matched text), I would add respecting accents (case-insensitively) when they form part of the user input.
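As a sketch of these two extra behaviours, assuming Python's standard `unicodedata` module: `fold_kana` and `accent_aware_match` are hypothetical helper names, accent stripping is limited to the U+0300..U+036F combining block for simplicity, and NFKC is used only as an optional width-folding step on top of the canonical normalisation discussed above.

```python
import unicodedata

def fold_kana(text: str) -> str:
    """Map half-width and full-width katakana onto hiragana."""
    # NFKC turns half-width katakana into their full-width equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana.
    return "".join(
        chr(ord(c) - 0x60) if "\u30a1" <= c <= "\u30f6" else c for c in text
    )

def strip_accents(text: str) -> str:
    """Remove combining marks in U+0300..U+036F (a simplifying assumption)."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(c for c in decomposed if not "\u0300" <= c <= "\u036f")
    return unicodedata.normalize("NFC", kept)

def accent_aware_match(user_input: str, option: str) -> bool:
    """Ignore accents in the option unless the user typed accents."""
    typed = unicodedata.normalize("NFC", user_input).casefold()
    candidate = unicodedata.normalize("NFC", option).casefold()
    if strip_accents(typed) == typed:
        # No accents typed: compare against the accent-stripped option.
        return typed in strip_accents(candidate)
    # Accents typed: they must be respected (still case-insensitively).
    return typed in candidate
```

For example, typing `resume` would find the option `Résumé`, whereas typing `résumé` would not find an option spelled `Resume`.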


@r12a r12a commented Sep 18, 2017

[@aphillips please correct me if needed]

For the SHOULD text I think we're asking HTML to require implementers to:

  1. apply a set of Unicode rules for case matching in certain locales - such as match dotted i with upper case dotted i for Turkish
  2. use the Unicode standard normalization algorithms to normalise text where the same characters are represented in more than one way - eg. é as opposed to e+<acute accent>

Both those should be testable.

@siusin siusin added the needs tests label Sep 18, 2017
@chaals chaals added this to the HTML5.3 WD5 milestone Jun 19, 2018
@LJWatson LJWatson modified the milestones: HTML5.3 WD5, HTML5.3 WD6 Jul 30, 2018
@LJWatson LJWatson removed this from the HTML5.3 WD6 milestone Sep 13, 2018

@siusin siusin commented Jul 29, 2019

We're closing this issue on the W3C HTML specification because the W3C and WHATWG are now working together on HTML, and all issues are being discussed on the WHATWG repository.

If you filed this issue and you still think it is relevant, please open a new issue on the WHATWG repository and reference this issue (if there is useful information here). Before you open a new issue, please check for existing issues on the WHATWG repository to avoid duplication.

If you have questions about this, please open an issue on the W3C HTML WG repository or send an email to public-html@w3.org.
