AccName trims whitespace but doesn't define which code points are whitespace #55
Comments
Using the whitespace definition from 'Flat string' should work for trimming at step 2C, but it is worth being explicit in 'Flat string' about the exact code points involved, for consistency with other W3C/WHATWG specs. |
One other thing to consider: the CSS spec is explicit about not rendering certain code points (Default_Ignorable), which are distinct from Unicode White_Space. AccName should take this into consideration: https://drafts.csswg.org/css-text-3/#white-space-processing Without taking this into account you can have an element with an AccName that contains no visible glyphs:
The zero width no-break space U+FEFF pops up quite often because it also functions as the byte order mark in UTF-16 (every Unicode file saved by Windows Notepad starts with this character), so it's easy to get it into a page using
|
To add to this list (thanks for this!): In a recent ARIA WG telco we discussed if U+2800 (blank braille pattern) should be considered whitespace (in the context of w3c/aria#924 and the requirement for UAs to ignore role descriptions containing only whitespace). |
From discussion on ACT-R I think you probably need two different concepts:
|
My $0.02 would be to use the 'space-separated tokens' microsyntax for roles since this is used in lots of other places (e.g. the A counter-argument is this could make authoring harder for braille users, but the problem already exists for other HTML attributes like Plus there are already 4 incompatible whitespace definitions in HTML/CSS/XML... Edit: sorry, I misunderstood role descriptions above. Please ignore this comment. |
Hi, As a baseline for Core AccName, add the following characters to explicitly define which are considered baseline whitespace characters.
Though this matches the HTML spec definition for whitespace, these characters exist within any text editor and don't require a specific user agent such as a browser to interpret them, so they would make a good core baseline.
I agree with the HTML spec in not including zero-width characters in this list, though, because doing so would break the accessible names of human-readable strings that contain them, at least given how the algorithm flattens each run of characters from the above list into a single space character " ". E.g., if you had a string such as
If you were to add a zero-width character to this list and apply the same logic to it, then a string meant to be read as a single word, but with a zero-width character between each pair of letters, would have every letter separated by a single space even though the word visually appears to have no spacing between the characters. In this case the most accessible solution is to ignore all zero-width characters and replace them with nothing, so that the computed name matches the content that is visually displayed.
So, with the above list as a baseline whitespace character list, user agents can add to it when native host semantics require additions within specific specs such as SVG, HTML, and CSS, within their respective algorithms.
Does this make sense? All the best, |
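The flattening behaviour described in the comment above can be sketched in a few lines of Python. This is only an illustration, not AccName's normative algorithm: `flatten_name` is a hypothetical helper, the baseline whitespace set is assumed to be the HTML "space characters" quoted elsewhere in this issue, and the zero-width set is limited to the two code points named in this thread (U+200B and U+FEFF).

```python
import re

# Assumption: baseline whitespace is the HTML "space characters" list
# (U+0020, U+0009, U+000A, U+000C, U+000D) quoted in this issue.
_WHITESPACE_RUN = re.compile(r"[ \t\n\f\r]+")

# Assumption: only the two zero-width code points mentioned in this thread.
_ZERO_WIDTH = re.compile("[\u200B\uFEFF]")


def flatten_name(text: str) -> str:
    """Hypothetical helper: compute a flattened, trimmed name string."""
    # Zero-width characters are replaced with nothing, so letters that
    # render as one word stay together in the computed name.
    without_zero_width = _ZERO_WIDTH.sub("", text)
    # Each run of baseline whitespace collapses into a single space.
    collapsed = _WHITESPACE_RUN.sub(" ", without_zero_width)
    # Trim the leading and trailing space left over after collapsing.
    return collapsed.strip(" ")


print(flatten_name("  Save \t\n changes  "))   # Save changes
print(flatten_name("a\u200Bb\u200Bc"))         # abc
print(repr(flatten_name("\uFEFF")))            # ''
```

If the zero-width characters were instead treated like the baseline whitespace, the second call would yield "a b c", which is the mismatch with the visual rendering that the comment warns about.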
Adding a note, I recently worked on a template linting rule - |
I took an action this morning to check with other WebKit engineers on the whitespace implementation and preferences. WebKit has several implementations differentiating "whitespace" in the contexts where the various specs disagree. For example:
So from an implementation perspective, it doesn't really matter which the spec uses... Ideally not another one though. There's a mild preference for HTML ASCII Whitespace unless there's a specific reason to use CSS Whitespace. Please double-check that advice during i18n review. |
@accdc wrote:
I don't think any of these characters should be listed in AccName. Instead of AccName hosting a copy of the HTML values, link across to the HTML Spec with prose indicating it's the definitive source. |
proposed for Sep 30 Deep Dive |
I think this was resolved via #165 and w3c/core-aam#128. Please re-open if there's something missing. |
This seems important to define because there's a lot of inconsistency between the whitespace definitions in different W3C specs.
HTML 5 uses two different definitions for white space:
https://www.w3.org/TR/html51/single-page.html#space-characters
HTML.1) White_space characters - defined as code points with the Unicode property "White_Space" in the Unicode PropList.txt data file
This definition is only used in the HTML spec to determine if table cells are empty. This definition includes non-ASCII spaces like non-breaking spaces (U+00A0) but excludes zero width spaces (U+200B and U+FEFF)
HTML.2) space characters
U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), U+000C FORM FEED (FF), and U+000D CARRIAGE RETURN (CR)
This definition is used in lots of places in the HTML spec.
CSS has another two definitions for whitespace - both different to the HTML definitions:
CSS.1) White space: the 'white-space' property
U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), and U+000D CARRIAGE RETURN (CR)
This is different from the HTML space characters definition - it doesn't include U+000C FORM FEED (FF)
https://www.w3.org/TR/CSS2/text.html#white-space-prop
https://drafts.csswg.org/css-text-3/#white-space-processing
CSS.2) The grammar for CSS files uses yet another definition of whitespace:
https://www.w3.org/TR/css-syntax-3/#whitespace
XML.1) XML uses another definition, inherited by XML-based formats like SVG and MathML:
(#x20 | #x9 | #xD | #xA)+
but this looks equivalent to the CSS.1 definition
U+0020 SPACE, U+0009 CHARACTER TABULATION (tab), U+000A LINE FEED (LF), and U+000D CARRIAGE RETURN (CR)
https://www.w3.org/TR/xml/#NT-S
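To make the comparison concrete, the three explicitly enumerated definitions above can be written out as sets and diffed. This is a minimal Python sketch; the constant names and the `describe` helper are made up for illustration, and the code points are copied from the lists in this issue (the Unicode White_Space property and the CSS.2 grammar definition are not enumerated here, so they are left out).

```python
# Code points copied from the lists in this issue; names are illustrative.
HTML_SPACE_CHARACTERS = frozenset({0x0020, 0x0009, 0x000A, 0x000C, 0x000D})  # HTML.2
CSS_WHITE_SPACE = frozenset({0x0020, 0x0009, 0x000A, 0x000D})                # CSS.1
XML_S = frozenset({0x0020, 0x0009, 0x000D, 0x000A})                          # XML.1


def describe(code_points):
    """Format a set of code points as sorted U+XXXX strings."""
    return [f"U+{cp:04X}" for cp in sorted(code_points)]


# XML.1 and CSS.1 contain the same code points.
print(XML_S == CSS_WHITE_SPACE)                            # True
# HTML.2 differs from CSS.1 only by FORM FEED.
print(describe(HTML_SPACE_CHARACTERS - CSS_WHITE_SPACE))   # ['U+000C']
print(describe(CSS_WHITE_SPACE - HTML_SPACE_CHARACTERS))   # []
```

So among these three, the only disagreement is whether U+000C FORM FEED counts as whitespace.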