Support for W3C's CSS Speech Module #4242
Comment 2 by jteh on 2014-07-02 06:18

Whether we should even do this is somewhat controversial. A screen reader is a bit different to an interface designed specifically for speech. The intention is to represent all functionality available to a "screen" user, even if, in doing so, the speech might not be as "friendly" as one might expect from a specialised speech interface. Being able to tell a screen reader how numbers should be read or how a name should be pronounced might be ideal, though even here, we would hit problems mapping this back to screen position, for example. However, we wouldn't want the content to be made entirely different.

As to this specific case: generally, secondary content such as a tooltip is exposed separately from the primary content, e.g. as the "description" of the accessible element. For example, if you use the @title attribute on a link, the link content will be the link's name and the title will be its description. This way, the two types of content are separated and the screen reader can choose how to handle them. This can also be done with ARIA attributes, e.g. aria-labelledby and aria-describedby. I feel this would be the more appropriate way to go here, i.e. expose them separately so that the AT decides how to handle them, rather than the library choosing a specific speech experience. The experience chosen by the library might be completely different from how a given screen reader normally reports tooltips.

I'm leaving this open because it certainly needs further discussion, but it's very low priority at this stage.
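The name/description split described above can be sketched as follows (hypothetical markup, not taken from the issue):

```html
<!-- The link's content becomes its accessible name; the title
     becomes its description, which the screen reader can report
     separately (e.g. on demand, or after the name). -->
<a href="/reports" title="Opens the quarterly reports dashboard">Reports</a>

<!-- The same split expressed explicitly with ARIA: -->
<a href="/reports" aria-describedby="reports-tip">Reports</a>
<span role="tooltip" id="reports-tip">Opens the quarterly reports dashboard</span>
```

Either way, the tooltip text stays separate from the primary content, leaving the presentation choice to the assistive technology.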
Comment 3 by mgifford on 2014-07-02 13:09

I have asked in FF https://bugzilla.mozilla.org/show_bug.cgi?id=47159 & Chrome https://code.google.com/p/chromium/issues/detail?id=369863&q=css3%20speech&colspec=ID%20Pri%20M%20Iteration%20ReleaseBlock%20Cr%20Status%20Owner%20Summary%20OS%20Modified but neither is supporting it yet: http://css3test.com/

I am sure that any of these elements could easily be abused in a way that makes content less accessible. speak-as, pause, rest and cue all seem like they could be quite useful if done properly. But as with the title attribute, it's so easy to get it wrong. I've felt that it would be nice to use voice-family consistently with, say, an admin theme, or perhaps administration functions provided by the CMS. If there was support for this, it might provide the same aural cues that we have visually. Are there places where the pros/cons for this have been publicly debated?

But yes, on the specific issue of tooltips, my sense is that the @title attribute has been badly abused and confused with alt text in general. My assumption has been that most screen reader users simply ignore the title as it usually isn't useful. I don't know that there is a "normal" for tooltips. I'm assuming that these are still great examples: Open Ajax Alliance & Dojo nightly, http://www.w3.org/WAI/PF/aria-practices/#tooltip

I'm assuming NVDA supports role="tooltip", and it does really feel like a describedby type of event. Hopefully we can keep this conversation going a bit more.
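For reference, the properties mentioned (speak-as, pause, cue, voice-family) look roughly like this in the CSS Speech Module Level 1 draft. The selectors and values below are purely illustrative, and as noted in this thread, no mainstream screen reader currently honours them:

```css
/* Read an order number digit by digit rather than as one value. */
.order-id { speak-as: digits; }

/* Insert a strong prosodic pause before and after block quotes. */
blockquote { pause: strong; }

/* Play a short audio cue before landmark navigation. */
nav { cue-before: url(chime.wav); }

/* Use a distinct voice for admin-theme chrome. */
.admin-toolbar { voice-family: female; }
```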
Comment 4 by jteh (in reply to comment 3) on 2014-07-03 22:50

It's certainly a tricky issue. On the surface, it does seem to make sense that if you can style something visually, you should be able to style it aurally. However, a visual user doesn't require an intermediary tool to present information to them in a primarily linear fashion, so it is a more direct mapping. One problem is that a screen reader might use certain voices for specific purposes, so if something else uses these, it might be very confusing.

> Are there places where the pros/cons for this have been publicly debated?

Not that I know of.

> My assumption has been that most screen reader users simply ignore the title as it usually isn't useful.

That's not really my experience, especially on form fields and links.

> I'm assuming NVDA supports the role="tooltip" and it does really feel like a describedby type of event.

Actually, NVDA doesn't really care about the tooltip role here. The key point is that aria-describedby references the tooltip, so the tooltip content becomes the "description" of the element in question. An NVDA user can then query this on demand, and it is also reported when the element is focused, just as a sighted user would generally have to mouse over the element (or interact with it in some other way).
@jcsteh's #4242 (comment) provides a series of seemingly compelling arguments about why this issue is extremely difficult to resolve, why it might be controversial to implement in the first place, etc. Keeping that in mind, I would like to kindly invite developers to further the discussion of this support request, for a module I don't believe too many NVDA users desire to work with in the first place, which requires significant code rewrites according to Jamie, and which poses several other UX/technical challenges. On the surface at least, wontfix or P4 sounds justified.
Any chance of supporting "@media speech" at least? It seems to be totally ignored by NVDA :/
I'd also like to keep this discussion warm, and argue against closing the issue just yet. Certainly, any rationale for not implementing CSS 3 Speech support in screen readers is opaque and under-described; even though there may be strong arguments against such an implementation, there are also strong arguments in favour. The debate needs a proper and public airing, so that content developers can easily understand the reasoning. I've not found it easy to find relevant discussions on this subject.

The W3C speech API has barely begun to get out there in the wild. I think the wisest course of action is to follow that rollout closely, and see whether it can somehow enrich the experience in NVDA and other screen readers. If it still seems like a canard at that point, then by all means close.

FWIW, I've already noticed web developers rushing ahead and implementing 'styled speech' in ways that conflict with WCAG recommendations. If I unilaterally get my website to voice its content (using the speech API, or just extensive use of pre-recorded HTML5 audio), how will screen readers handle the collision? It might be a rare thing today, but I expect it will be more common in the future as developers attempt to be WCAG compliant. At the very least, this particular issue should not be ignored.

Back to CSS 3 Speech: there is (I think) a compelling argument for mapping different semantics onto different 'kinds' of speech. There seems to be a use case for (say) aria-live regions to be distinguished from control labels, and each of those distinguished again from static text content, etc. More fine-grained or content-specific semantic differences are easy to imagine. When I say 'distinguish', I mean that it could be spoken in a different kind of voice (perhaps something as subtle as using the azimuth setting, or as radical as a different gender). One way this might be done could be to link particular ARIA roles to particular voice settings using CSS 3 Speech properties. Another way might be to offer options to make such mappings in the screen reader preferences, though those are already very complex.

I'd like to invite anyone interested to read this article, which breaks down audio into four 'typologies' (essentially, semantic categories). These categories might not be the best fit for general web content, but they could help to form a 'mental model' for how different audio characteristics could be used to denote different semantics.
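The role-to-voice mapping suggested above could, in principle, be expressed with attribute selectors. This is a hypothetical sketch; no browser or screen reader applies these properties today, and the voice values are illustrative:

```css
/* Distinguish live-region announcements from ordinary content. */
[aria-live="polite"],
[role="status"] {
  voice-family: male;
  voice-pitch: low;
}

/* Give interactive controls a faster, higher voice than static text. */
label,
[role="button"] {
  voice-rate: fast;
  voice-pitch: high;
}
```

The same mappings could equally live in screen reader preferences rather than author CSS, which is the alternative raised above.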
@sKopheK the Media Queries 4 spec makes it explicit that screen readers should match the 'screen' media type (and not 'speech'), because they read the screen. All the screen readers we tested (VoiceOver, JAWS, NVDA, WindowEyes, System Access and Dolphin) match @media screen and @media all, but not @media speech or @media aural.
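Concretely, this means a stylesheet split like the following has its speech branch ignored by every screen reader listed above (example rules for illustration only):

```css
/* Screen readers match 'screen' (and 'all'), so these rules apply. */
@media screen {
  .tooltip { border: 1px solid #333; }
}

/* Per Media Queries 4, 'speech' does not match screen readers,
   so rules here are effectively dead code today. */
@media speech {
  .tooltip { voice-family: female; pause-before: strong; }
}
```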
Thanks for the explanation.
@sKopheK There is a way in CSS to provide an alternative for generated content, but I don't know how well supported it is: https://www.w3.org/TR/css-content-3/#accessibility
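The css-content-3 draft linked above lets authors supply alternative text after a slash in the content value. A small illustration (class and file names hypothetical; browser support varies):

```css
/* The image is rendered visually; the string after the slash is
   exposed to assistive technology as the alternative text. */
.new-badge::before {
  content: url(new-icon.png) / "New!";
}
```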
Our live region updates every couple of seconds, and our product is all about training rapid responses (for first aid). Urgency is an intentional part of the experience, but confusing the UI labels with the fictional accident is not. We just did some user tests, and can confirm that in our web app, the babble of aria-live content spoken in a contiguous stream alongside UI accessible names, announced in the exact same voice, cripples usability. This was with aria-live="polite", by the way, which is supposed to be the least pushy setting beyond pure silence. I hoped for gaps, at least.

We may have to abandon aria-live altogether and roll our own 'live region', just to get a different voice. We really need to be able to distinguish semantics with different voice settings, whether with CSS, by distinguishing different aria-live 'channels', or by some other mechanism. By all means, let the details of those voice choices be up to the user, in much the same way as the user can choose font-family settings for 'serif', 'sans-serif', 'monospace' or 'fantasy' in the browser preferences.
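For context, the pattern being described is roughly this (hypothetical markup; the complaint is that both elements are announced in the same voice, with no audible seam between them):

```html
<!-- UI control: its accessible name is announced on focus. -->
<button>Apply pressure</button>

<!-- Scenario narration: updates every few seconds. "polite" only
     queues announcements behind current speech; it does not change
     the voice or add gaps. -->
<div aria-live="polite" id="scene-status">
  The casualty's breathing is becoming shallow.
</div>
```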
Just found this, which states:
@derekriemer, @jcsteh, @michaelDCurran, @feerrenrut your thoughts would be very much appreciated.
Also @MarcoZehe, and anyone from Microsoft.
Yet another use case: being able to create something like Emacspeak for code, where the semantic meaning of words is translated to a different pitch (e.g. variable names sound different than class names). https://en.m.wikipedia.org/wiki/Emacspeak

I personally think that if content includes speaking hints specifically for screen readers, it's there to help screen reader users, added with good intentions and probably not as an afterthought.
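With CSS Speech properties, this Emacspeak-style idea could piggyback on ordinary syntax-highlighting markup. The token class names below follow common highlighter conventions and are only illustrative; again, no screen reader applies these properties today:

```css
/* Vary voice characteristics by token kind in highlighted code. */
.token.variable   { voice-pitch: high; }
.token.class-name { voice-pitch: low; voice-stress: strong; }
.token.comment    { voice-volume: soft; voice-rate: slow; }
```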
Could this kind of support be implemented as an NVDA add-on? Emphasis could be done using different pitch, volume, delay, etc.
Just reading through this now. Given I'm not familiar with the background of this, hopefully I haven't totally misunderstood the point; apologies if so.

The reasoning given in support of this issue seems mostly to be to allow web developers control over how differing semantics are presented to the user. I argue this is the wrong place to map the presentation of semantics. The likely outcome would be different websites providing conflicting, or at least inconsistent, presentations of semantics. This will only be more confusing for the user. It also ensures inconsistency with desktop applications.

I strongly think this mapping should be done by the screen reader. Ideally, it would be configurable by the user to account for any specific preferences or needs they may have. The experiment with aria-live is an interesting one, and likely something we could resolve within NVDA. I can imagine use cases for entertainment-type applications, e.g. ebooks, games, or similar. However, to reduce cognitive load, and to meet the preferences and needs of the user, the screen reader should provide a consistent experience for consuming information and interacting with applications (web or otherwise).
Cool, so what do you have in mind for this mapping that is done by the screen reader?
Pardon my lack of in-depth know-how on the CSS3 Speech API, as I have started reading about it quite recently. But I think, from a web developer's perspective, implementing speak: none; in CSS instead of aria-hidden in HTML would be the equivalent of a text-heading-level: 1; in CSS instead of an h1 element in HTML. It should be noted that semantics have always been delivered to assistive technologies by HTML. So to expect assistive technologies to go and look for semantic information back in CSS is kind of defying the purpose of segregating semantics from design.
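The contrast being drawn here (including the braille distinction raised in the reply) can be sketched like this. The markup is hypothetical, and speak: none comes from older drafts of the module (Level 1 renames it speak: never); no current screen reader honours either:

```html
<style>
  /* Would suppress only the spoken rendering; the element stays in
     the accessibility tree, so braille could still present it. -->
     (per the CSS Speech draft) */
  .stars { speak: none; }
</style>

<!-- aria-hidden removes the element from the accessibility tree
     entirely: neither speech nor braille presents it. -->
<span aria-hidden="true">★★★☆☆</span>

<!-- Speech-only suppression via CSS. -->
<span class="stars">★★★☆☆</span>
```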
I think the major difference is that speak:none would presume that braille could still occur, whereas aria-hidden hides both.
--
Derek Riemer:
Improving the world, one byte at a time. ⠊⠍⠏⠗⠕⠧⠬ ⠮ ⠸⠺⠂ ⠐⠕ ⠃⠽⠞⠑ ⠁⠞ ⠁ ⠐⠞⠲
Software engineer, Drive web
Adding an older thread here: https://sourceforge.net/p/nvda/lists/nvda-commits/thread/054.dc1ab85e62b6cf0bbf5f855dadb7a9eb%40nvaccess.org/ See also https://css-tricks.com/lets-talk-speech-css/ and, more recently, https://www.meetup.com/css-cafe/events/291837233/
The newest specifications are documented here:

As far as I understand, this speech module is not really only for screen readers, but especially also for use in industries and in situations where actively controlling a device is not appropriate, e.g. while driving a car, or when using the read-aloud feature on websites like newspapers, where people can click a button to have an article read out loud. Another use case is the feature of reading a PDF out loud in Edge or Adobe Reader with their internal speech synthesizer.

Anyway, if we put control of the speech in the hands of a web author, this should definitely be an optional setting in the screen reader settings.
Given there is no clear screen reader use case documented in this discussion, I am closing this for now. Please contribute a concrete screen-reader-related use case and we can reopen, or open a new discussion with a screen reader use case in mind.
Reported by mgifford on 2014-07-02 00:20
I'm trying to see if there is a way to improve the accessibility of http://kushagragour.in/lab/hint/, which is now part of Drupal 8.
I'd like to see support for http://www.w3.org/TR/css3-speech/ so that we could either insert a pause or change the voice family before the tooltip is read.

Right now in VoiceOver it is all read together; in ChromeVox it gets ignored. However, there should be some means to convey aurally that the tooltip is distinct from the text it is describing.
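What this asks for would look something like the following. The selector is a guess at how the hint.css library renders its tooltips (a data attribute plus a pseudo-element), and these CSS Speech properties are not implemented by any current screen reader:

```css
/* Pause briefly, then switch voices, before the tooltip text so it
   is audibly distinct from the element it describes. */
[data-hint]::after {
  content: attr(data-hint);
  pause-before: medium;
  voice-family: female;
}
```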
This is probably a lot bigger than NVDA. Does NVDA support the CSS Speech Module?