Semantic support (not just style support) for del and ins on web pages #4920

Closed
nvaccessAuto opened this Issue Feb 13, 2015 · 16 comments

2 participants

@nvaccessAuto

Reported by paulbohman on 2015-02-13 22:50
I know that it's possible to have NVDA read the style attributes of strikethrough and underlined text (preferences > document formatting > report font attributes), which is how del and ins are styled by default in browsers, but NVDA is not conveying the proper semantics this way. I would like NVDA to treat del and ins semantically, rather than just stylistically, and I'd like this meaning to be conveyed by default, not just by turning on a setting in the preferences.

There are a few problems with relying on styles to convey meaning, as NVDA is doing here:

  1. NVDA doesn't say "deleted" or "inserted" as it should. It just says "strikethrough" and "underlined" which isn't the same thing. There may be multiple instances of underlined text on the same page, and some of it may not be inserted text. It could be a heading or even a regular link, for example. NVDA will say "underlined" in all these cases. Users can't tell the difference between underlined text and inserted text. The style is not guaranteed to convey that particular meaning.

  2. Web designers may choose to override the default styles of the browser using CSS. The designer could choose to make deleted text grayed out, for example, and could choose to make inserted text red. I'm not saying that's a good idea. I'm just saying that if you're relying on styles alone, it's possible someone could choose different styles, which ruins NVDA's ability to derive any meaning at all.

  3. The fact that the option is currently not turned on by default means that web developers can't rely on del and ins to convey any meaning, even though both of them are semantic elements that in fact convey a lot of meaning, and it's important meaning, especially in legal documents and such.

Here is a test file: https://dequeuniversity.com/testsuite/text/strikethrough

Hoped-for result: The screen reader will say "begin deleted text," "end deleted text," "begin inserted text," "end inserted text" (or similar).

Actual result: NVDA does not announce anything at all in default mode. With "report font attributes" turned on it does announce the style features, but does not announce any of the semantic meaning, especially when the styles are altered by the web developer.

Recommendation: Read the tags themselves, rather than the styles, announce the meaning (deleted and inserted), and turn on this meaning by default.
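
To illustrate the recommendation, here is a minimal Python sketch that derives the announcements from the tag names alone, so author CSS cannot affect them. It is only an illustration under stated assumptions: it parses raw HTML with the standard library's `html.parser`, whereas NVDA would get this information from the browser's accessibility API or DOM. The names `RevisionAnnouncer`, `announce`, and `ANNOUNCEMENTS` are hypothetical, and the phrases are taken from the hoped-for result above.

```python
from html.parser import HTMLParser

# Hypothetical phrases, based on the "hoped-for result" in this report.
ANNOUNCEMENTS = {"del": "deleted text", "ins": "inserted text"}


class RevisionAnnouncer(HTMLParser):
    """Builds a speech sequence that marks del/ins by tag name and ignores styling."""

    def __init__(self):
        super().__init__()
        self.speech = []

    def handle_starttag(self, tag, attrs):
        # attrs (including any style attribute) is deliberately not consulted.
        if tag in ANNOUNCEMENTS:
            self.speech.append(f"begin {ANNOUNCEMENTS[tag]}")

    def handle_endtag(self, tag):
        if tag in ANNOUNCEMENTS:
            self.speech.append(f"end {ANNOUNCEMENTS[tag]}")

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.speech.append(text)


def announce(html):
    """Return the list of phrases a reader would hear for the given HTML."""
    parser = RevisionAnnouncer()
    parser.feed(html)
    parser.close()
    return parser.speech


# The author has overridden the default strikethrough and underline styles.
sample = (
    'Price: <del style="text-decoration:none;color:gray">$20</del> '
    '<ins style="text-decoration:none;color:red">$15</ins>'
)
print(announce(sample))
```

Because the style attributes are never read, the deleted and inserted text is still announced even though neither element is rendered with strikethrough or underline.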

@nvaccessAuto

Comment 1 by jteh on 2015-02-13 23:11
While I normally push for semantics over style, I've always found elements like this to be tricky. Strong and em, for example, don't really mean anything to most people, even though they have more semantic meaning than bold or italic. That said, I think ins and del would mean more to most users semantically speaking.

We'll need to discuss this with browser vendors. At present, Firefox doesn't expose this via a11y APIs at all. I don't think it's even possible to expose this via IAccessible2, so that'll require a spec change. For IE, we can hackishly parse the DOM as usual.

I think we could map this to NVDA's "Report editor revisions" setting. It's not a precise match, but IMO, it's less confusing than introducing a new setting.

@nvaccessAuto

Comment 2 by paulbohman on 2015-02-13 23:27
First of all, thank you for the prompt responses. I've always been impressed with your attentiveness.

When I looked through the options, I actually did expect del and ins to be activated by the "report editor revisions" setting, so I agree that's better than where it is now.

But that alone wouldn't be much of an improvement if the screen reader isn't reading the underlying semantics. I'm really surprised to hear that Firefox doesn't expose del or ins through the accessibility API. I'll see if I can file a report with Mozilla.

But I really do think that these semantic tags need to be read by screen readers by default. You mentioned strong and em, which I've been waiting for screen readers to support correctly by default since 1999 when I started in the accessibility field! Are those not conveyed through the accessibility API either?

@nvaccessAuto

Comment 3 by paulbohman on 2015-02-13 23:42
Here's a list of some of the semantic tags I'd like to see supported and read as actual semantic tags, and not just as styles:

<strong> — strong emphasis
<em> — emphasis
<q> — inline quotation
<pre> — preformatted text
<code> — computer code (such as HTML, JavaScript, PHP, etc.)
<address> — contact information for the creators of a document
<time> — date and/or time
<del> - deleted text
<ins> - inserted text

If a screen reader actually said "strong emphasis" by default, or just regular "emphasis" with the strong and em tags respectively, I think that would be a huge step forward for semantic meaning, actually, so I'm not sure why you say they don't really mean anything to most people. I'm a sighted user, and I pay a lot of attention to bold and italic text because it almost always conveys real meaning. Even a simple phrase like "I'm happy today" can convey three different meanings depending on which word you emphasize. If I emphasize "I'm" that means that I'm distinguishing myself from someone else who may be associated with the conversation. If I emphasize "happy" then I'm emphasizing the emotion itself, to distinguish between other emotions like sadness or confusion etc. If I emphasize "today" then I'm distinguishing between yesterday and tomorrow or some other time period.

I suppose the ideal for strong and em would be to have the voice synthesizer actually emphasize it the way a human would with the voice: say the word slower and at a higher pitch, often with a complex inflection that I admit would be difficult to program. But if it can't be done with the voice, it could either say "emphasis" or it could use a click or a beep with an up sound at the start, then a click or beep with a down sound at the end of the emphasized section. Maybe it could be a single beep for emphasized text, and a double beep for strongly emphasized text.
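
The beep idea above could be sketched roughly as follows. This is a hypothetical illustration only: the `BEEPS_PER_TAG` table and the `cue_sequence` function are invented names, and the output is a plain list of events rather than real audio or any actual NVDA API.

```python
# Hypothetical cue table: one beep for em, two for strong, as suggested above.
BEEPS_PER_TAG = {"em": 1, "strong": 2}


def cue_sequence(tag, text):
    """Return (kind, value) events: rising beeps, the spoken text, then falling beeps."""
    count = BEEPS_PER_TAG.get(tag, 0)
    events = [("beep", "up")] * count
    events.append(("speak", text))
    events.extend([("beep", "down")] * count)
    return events


print(cue_sequence("em", "happy"))
print(cue_sequence("strong", "today"))
```

Tags without an entry in the table produce no beeps, so ordinary text is spoken unchanged.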

Mainly, though, whatever method is used, I would want it to be turned on by default for all of these types of semantic tags, because otherwise meaning is lost.

I would personally put a higher priority on strong, em, del, ins, and code than the others, but the others are meaningful too.

@nvaccessAuto

Comment 4 by jteh (in reply to comment 3) on 2015-02-14 00:29
Replying to paulbohman:

> <pre> — preformatted text
>
> <code> — computer code (such as HTML, JavaScript, PHP, etc.)

I don't follow why the semantic meaning of these two is important for a screen reader user. Certainly, saying "pre-formatted text" is just not going to make sense to the average user, especially since it can be used for all sorts of things. There's perhaps a stronger argument for code, but there, verbosity would start to get annoying. Imagine reading a paragraph in a technical reference manual.

> <address> — contact information for the creators of a document
>
> <time> — date and/or time

Nor these. The semantics do mean something and I'm not saying the tags aren't useful, but to a screen reader user, the fact that something is a date or an address is obvious. Saying "(date) 14 February 2015 (out of date)" is just pointless verbosity, just as saying "(paragraph) This is a paragraph of text. (out of paragraph)" is pointless.

> If a screen reader actually said "strong emphasis" by default, or just regular "emphasis" with the strong and em tags respectively, I think that would be a huge step forward for semantic meaning, actually, so I'm not sure why you say they don't really mean anything to most people.

I'm arguing that "bold" actually makes more sense to most users than "strong emphasis". Even though it's a visual thing, the idea of "bold" has become very synonymous with emphasis.

> Mainly, though, whatever method is used, I would want it to be turned on by default for all of these types of semantic tags, because otherwise meaning is lost.

The problem is that you can argue that for just about everything. Too much verbosity is a problem for screen reader users because speech already has efficiency problems as it is. I can almost guarantee you that if we enabled emphasis reporting by default, we would have requests from users to revert it. Ins and del are different because the text possibly won't make any sense at all without them. With emphasis, there is additional meaning, but it isn't impossible to interpret the text without it.

@nvaccessAuto

Comment 5 by jteh on 2015-02-14 00:30
If you file a bug with Mozilla, please cc me. (Just type :jamie in the cc field.)

@nvaccessAuto

Comment 6 by paulbohman on 2015-02-14 00:44
Concerning the code tag: I write lots of web tutorials about web accessibility, and some of them include code examples. Not all of my readers are technical enough to understand the code examples, and it can be unclear sometimes when a code snippet begins and ends if it's just inline with a regular sentence. It depends on what is actually written in the code snippet, of course, but I want to make it clear what is code and what isn't. That's why I think it's useful to have that one spoken in some way.

I'm fine if you decide that "bold" makes more sense than "strong emphasis."

I agree that pre, address, and time are less useful to know about. I think I would be fine if those were an option that was off by default. They could be useful to your pronunciation rules though: if you come across something marked as <time>, you could make sure the screen reader reads it the way a human would pronounce the time, and not say something like "four colon zero zero."

Overall I'm not saying that literally everything ought to be turned on by default, but I am saying that semantic tags were invented primarily for screen readers and computer parsing, and it's a shame that we're not fully taking advantage of them. Strong, em, del, and ins would top my list, though I would also consider code important. The others could be optional, probably. Oh, and sup and sub. I consider those quite important.

Maybe the ideal would be to create a new category of setting in the preferences for NVDA for "semantic markup" and then give users checkboxes for each of the tags mentioned above (and possibly others that I may be forgetting). The most important ones (which in my opinion would be strong, em, del, ins, sup, sub, code) would be checked by default. The others would be unchecked by default.
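
As a rough sketch of the proposal above, the per-tag checkboxes could be modelled as a table of defaults that user choices override. Everything here is hypothetical: `DEFAULT_SEMANTIC_REPORTING` and `should_report` are invented names, and this is not NVDA's actual configuration format.

```python
# Hypothetical defaults following the proposal: the "most important" tags are
# checked by default, and the remaining tags from the earlier list are unchecked.
DEFAULT_SEMANTIC_REPORTING = {
    "strong": True,
    "em": True,
    "del": True,
    "ins": True,
    "sup": True,
    "sub": True,
    "code": True,
    "q": False,
    "pre": False,
    "address": False,
    "time": False,
}


def should_report(tag, user_settings=None):
    """Return True if the tag should be announced, letting user choices override defaults."""
    settings = {**DEFAULT_SEMANTIC_REPORTING, **(user_settings or {})}
    # Tags that are not listed at all are never announced.
    return settings.get(tag.lower(), False)


print(should_report("del"))                 # checked by default
print(should_report("time"))                # unchecked by default
print(should_report("em", {"em": False}))   # user has unchecked emphasis
```

Keeping the defaults separate from the user's overrides means a default can later be changed without discarding anyone's explicit choices.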

If you search through threads over the years on the WebAIM list or W3C list, you'll see lots of people continually asking if strong and em are supported yet, along with superscript, strikethrough, and other semantic tags. You'll read quite a bit of frustration in their posts as they hear "not yet" over and over.

@nvaccessAuto

Comment 7 by paulbohman on 2015-02-14 01:00
It looks like you're already CC'd on the relevant bug for Mozilla, filed August 8, 2013: https://bugzilla.mozilla.org/show_bug.cgi?id=903187

I added a comment to it.

@nvaccessAuto

Comment 8 by jteh (in reply to comment 6) on 2015-02-14 01:35
Replying to paulbohman:

> Concerning the code tag: I write lots of web tutorials about web accessibility, and some of them include code examples. Not all of my readers are technical enough to understand the code examples, and it can be unclear sometimes when a code snippet begins and ends if it's just inline with a regular sentence. It depends on what is actually written in the code snippet, of course, but I want to make it clear what is code and what isn't. That's why I think it's useful to have that one spoken in some way.

Again, please file a bug with Mozilla. Aside from the fact that we'll need them to implement it if we want this, it probably wouldn't hurt to have this discussion there.

> I'm fine if you decide that "bold" makes more sense than "strong emphasis."

I guess one advantage of handling emphasis separately is that it means we won't get bold/italic reporting for headings, etc. where the semantic info is actually the heading, not the way it's presented.

> I agree that pre, address, and time are less useful to know about. I think I would be fine if those were an option that was off by default.

IMO, they shouldn't be an option period. Still, let's spin that off into a separate discussion, as this point is more subjective.

> Oh, and sup and sub. I consider those quite important.

Agreed.

> If you search through threads over the years on the WebAIM list or W3C list, you'll see lots of people continually asking if strong and em are supported yet, along with superscript, strikethrough, and other semantic tags. You'll read quite a bit of frustration in their posts as they hear "not yet" over and over.

It's worth noting that we do support strike, super and sub. We just don't report them by default. Also, while you make valid points, the reality is that we must always consider the concerns of our users over those of authors. If users find that it causes excessive verbosity, that is reason enough for this not to be a default.

@nvaccessAuto

Comment 9 by jteh (in reply to comment 7) on 2015-02-14 01:40
Replying to paulbohman:

> It looks like you're already CC'd on the relevant bug for Mozilla, filed August 8, 2013: https://bugzilla.mozilla.org/show_bug.cgi?id=903187

Hmm. It looks like I even suggested how it could be implemented in IAccessible2. I totally don't remember that; it all starts to blur after a while. :)

Btw, with respect, if the web community desperately wants all of this stuff to be supported, it'd be great if people could get together to help to make it happen, whether by submitting patches or contributing financially.

@nvaccessAuto

Comment 10 by paulbohman on 2015-02-14 02:47
Concerning donations: That's a fair point. I just submitted a donation. (I've donated in the past too.) I also use NVDA on a regular basis in my training workshops with developers, so I'm increasing your user base all the time, mostly among developers and QA testers. Most are not blind, but they are still a valid part of your user base. As far as actually submitting code contributions, there are far better programmers than myself, so I have to leave that to others.

Concerning the separation of bold and emphasis on headings etc: yes, I would like to see that separation. Knowing that a heading is bold isn't very useful, because its status as a heading already gives it prominence. In fact, someone may even decide to emphasize a word or a phrase within a heading:

This is very important

That's a trivial example, of course, but you get the idea. If we don't separate styles from semantics, that meaning gets lost.

On supporting semantic tags by default: As we've discussed, some are more important than others. I've seen del and ins used to convey prices on e-commerce sites. Old price versus new price. That's an important distinction. I've seen strong and em all over the place. I get the feeling that they are usually used correctly, but I know there are some sites that abuse them. With all the clients we have, though, that doesn't seem to be a common form of code abuse. (There are plenty of other abuses.) The code tag is extremely common on technical sites, but almost non-existent elsewhere. Within the technical world, though, it's pretty important. Superscript and subscript blur into the regular text if not read correctly, and meaning definitely gets lost that way.

I think a lot of this is a chicken and egg question though. If screen readers support these features really well, then accessibility professionals like myself can enthusiastically endorse the correct use of the tags. We can finally say: Yes! If you use the semantic tags, screen readers will convey your message just as you intended it! That's kind of an important message.

@nvaccessAuto

Comment 11 by jteh (in reply to comment 10) on 2015-02-14 03:15
Replying to paulbohman:

> Concerning donations: That's a fair point. I just submitted a donation. (I've donated in the past too.) ...

Thank you. We do appreciate that. As I said, I wasn't directing this at anyone in particular; it was a general observation.

> Within the technical world, though, it's pretty important.

I'd argue that within the technical world, it's usually clear from context when something is code. Certainly, as a coder myself, it would drive me utterly crazy to hear "code" everywhere, since there could be several instances per line.

> I think a lot of this is a chicken and egg question though. If screen readers support these features really well, then accessibility professionals like myself can enthusiastically endorse the correct use of the tags.

True, though then we get "but screen reader x is more popular (and costs a lot) and it doesn't support it, so we still can't recommend it". :(

@nvaccessAuto

Comment 12 by paulbohman on 2015-02-14 03:23
Believe me, I know all about screen reader x comparisons :)

If it helps, I can tell you that we work with a lot of high profile companies, and several of them now use NVDA as their primary testing screen reader, and use it as their benchmark. Some only test in NVDA. Others test in a variety of screen readers. NVDA is in fact the benchmark at my own company. And since our whole profession is web accessibility, that says a lot, I think. But of course we use other screen readers too, and I'm submitting these same feature requests to screen readers x, y, and z!

You might be right about the code tag, but in the interest of promoting good semantic markup, I still hope it can be among the tags that are supported, even if it's turned off by default.

@nvaccessAuto

Comment 13 by mdcurran on 2015-05-07 04:26
We'll start looking at implementing some of these in our Internet Explorer support. FF will have to wait until the Mozilla bug is fixed.

@nvaccessAuto

Comment 14 by Michael Curran <mick@... on 2015-09-21 04:38
In commit 4982433:
Merge branch 't4920' into next. Incubates #4920
Changes:
Added labels: incubating

@nvaccessAuto

Comment 15 by Michael Curran <mick@... on 2015-10-06 05:26
In commit 27fab9c:
Merge branch 't4920'. Fixes #4920
Changes:
Removed labels: incubating
State: closed

@jcsteh

Having emphasis reported by default has been extremely unpopular with users and resulted in a lot of complaints about NVDA 2015.4. The unfortunate reality is that emphasis is very much over-used in the wild. I had serious misgivings that this would be the result when we implemented this and it seems these unfortunately turned out to be quite warranted. As such, we've now disabled this by default, though the option is still there for those that want it.
