Read basic keyboard math symbols, math HTML entities, math character sets #3799

Closed
nvaccessAuto opened this Issue Jan 22, 2014 · 11 comments


@nvaccessAuto

Reported by paulbohman on 2014-01-22 04:25
I would like NVDA to read basic math symbols, including those on the keyboard, HTML entities and common character sets in the default verbosity setting.

In the default configuration, NVDA will say "one one two" instead of "one plus one equals two" and will ignore other relatively common math symbols like greater than, less than, greater than or equal to, less than or equal to, and others.

It is true that some people will abuse these characters and use them to create horizontal lines or other ASCII art, but most of the time the characters mean what they're supposed to mean. Web developers and content writers need to know that basic math can be communicated to users, and I don't mean with MathML or anything fancy. I'm talking about just basic keyboard symbols, HTML entities, and math-related character sets.

See http://www.deque.com/dont-screen-readers-read-whats-screen-part-1-punctuation-typographic-symbols for a related blog post
Blocked by #5234
Blocking #5211

@nvaccessAuto

Comment 1 by paulbohman on 2014-01-22 04:55
Changes:
Changed title from "Read basic keyboard math symbols. math HTML entities, math characte sets" to "Read basic keyboard math symbols. math HTML entities, math character sets"

@nvaccessAuto

Comment 2 by jteh on 2014-01-22 05:08
For the most part, I agree with this, though we'll have to judge user reaction. However, I am concerned about the dash (-) symbol. That is often used to indicate a minus, but because it is also used as a hyphen, I don't think it should be spoken by default. I'd argue authors should be using the correct Unicode symbol anyway. Would you be happy with this?

@nvaccessAuto

Comment 3 by paulbohman on 2014-01-22 05:49
If we can have it read math symbols as math symbols, and not skip over them, that will help a lot.

I agree that the dash is problematic. Most people will type a dash instead of using a proper minus symbol, and that's pretty much never going to change. The dash symbol is too easy and the minus symbol is too hard. But wherever the real minus symbol (− or another official encoding, such as unicode 2212) is used I think it should definitely be read all the time, because if someone goes to the trouble of putting it in, they want it to be read. This may not be the most authoritative resource for character encoding, but it has some good info: http://www.fileformat.info/info/unicode/char/2212/index.htm

The truth is that I think the dash ought to be read nearly all the time, but if you don't want to do that, can we have the dash read out loud when there are numbers immediately before or after it? The dash might be part of a social security number, or telephone number, or license key, or something else, and often the dash is important in those situations. Another condition where the dash should probably be read out loud is when there are unpronounceable combinations of letters before or after it, such as in a license key or password. Example: JIK7D-23KDFK-SDFK89S-POZD

Distinguishing between "dash" and "minus" will be hard, I admit, but hearing "one dash one equals zero" is still better than hearing "one one zero." And if the encoding is a real minus symbol, then of course the screen reader should say "minus," not "dash." If it's possible to go a step further, the screen reader could try to interpret content as either math or text, based on some algorithms that look for patterns. If you do that, then you can have it say "dash" under some circumstances and "minus" under other circumstances. Those algorithms would be tricky, but not impossible, at least for basic arithmetic and algebraic expressions, which fit definite patterns.
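A rough sketch of the dash rule proposed above, in Python. This is only an illustration of the idea, not NVDA code: the names `dash_label` and `label_dashes` are hypothetical, and looking past spaces to find the nearest neighbour is an assumption added so that "1 - 1 = 0" is covered. It says "minus" for the real minus sign (U+2212), "dash" for a keyboard dash with a digit next to it, and leaves other dashes unspoken. The "unpronounceable letters" rule for license keys is not attempted here.

```python
MINUS_SIGN = "\u2212"  # the dedicated Unicode minus sign
HYPHEN_MINUS = "-"     # the ordinary keyboard dash


def _nearest_non_space(text, start, step):
    """Return the first non-space character from start, moving by step, or ''."""
    i = start
    while 0 <= i < len(text):
        if not text[i].isspace():
            return text[i]
        i += step
    return ""


def dash_label(text, index):
    """Return 'minus', 'dash', or None (not spoken) for the character at index."""
    ch = text[index]
    if ch == MINUS_SIGN:
        return "minus"  # a real minus sign is always announced
    if ch != HYPHEN_MINUS:
        return None
    before = _nearest_non_space(text, index - 1, -1)
    after = _nearest_non_space(text, index + 1, 1)
    # Assumption: a digit on either side is enough to announce the dash.
    if before.isdigit() or after.isdigit():
        return "dash"
    return None


def label_dashes(text):
    """Labels for every dash-like character in text, in order of appearance."""
    return [
        dash_label(text, i)
        for i, ch in enumerate(text)
        if ch in (MINUS_SIGN, HYPHEN_MINUS)
    ]
```

With this rule, "555-1234" and "1 - 1 = 0" get "dash" and "well-known" stays silent. In the license key example only the first dash is announced, because only that one has a digit directly beside it, which shows why a second rule would be needed for keys and passwords.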

Ideally, you'd take into account the keyboard punctuation, HTML entities, and unicode characters, at a minimum. There may be other character sets worth taking into account too. If it seems too daunting to explore all possible character sets, stick to just the three mentioned: keyboard, HTML and unicode. And then maybe branch out to other character sets later.

@nvaccessAuto

Comment 4 by briang1 on 2014-01-22 09:00
One can of course already change the category, and even the word used, for each symbol. In Usenet, certain characters are used at the start of every line of a quoted section. Not sure if it's greater than or less than, but as long as the interpretation used could be defined in profiles, one could get around the annoyance of this issue.

One also has to remember that the more processing you decide to do, the longer it will take!
As this also touches on what we call symbols, I'd suggest looking again at this for the default settings. Underscore says "line", which is a bit odd, and exclamation says "bang", etc., but I do understand other languages and cultures have different words.
Also, of course, doing more processing here will mean right-to-left routines get a whole lot more complex!

@nvaccessAuto

Comment 5 by paulbohman on 2014-01-22 12:54
Changes:
Changed title from "Read basic keyboard math symbols. math HTML entities, math character sets" to "Read basic keyboard math symbols, math HTML entities, math character sets"

@nvaccessAuto

Comment 6 by Michael Curran <mick@... on 2015-05-07 02:38
In [c584102]:
```CommitTicketReference repository="" revision="c5841029a6db433abf4fc2de25182d1d22e21a35"
Merge branch 't3799' into next. Incubates #3799
```

Changes:
Added labels: incubating

@nvaccessAuto

Comment 7 by mdcurran on 2015-05-07 02:41
The changeset above moves several symbols from most to some, including #<>±×÷
Also () and – are now preserved to ensure that synths will pause/inflect appropriately if they support these characters.

Symbols such as + and = were already moved from most to some in 2014.1.
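
To illustrate what moving a symbol from most to some means for the user, here is a simplified Python model. It is a sketch only and not NVDA's actual implementation: the `speak` function, the table layout and the spoken names are all hypothetical, and it assumes the user's symbol level is "some". It does not model the preserve setting.

```python
LEVELS = ["none", "some", "most", "all"]  # increasing verbosity


def speak(text, user_level, table):
    """Replace each symbol in table with its spoken name if the user's
    level is at least the symbol's level; otherwise drop the symbol."""
    out = []
    for ch in text:
        entry = table.get(ch)
        if entry is None:
            out.append(ch)  # not a known symbol, pass through unchanged
            continue
        name, symbol_level = entry
        if LEVELS.index(user_level) >= LEVELS.index(symbol_level):
            out.append(f" {name} ")
        else:
            out.append(" ")  # symbol is skipped at this level
    return " ".join("".join(out).split())  # collapse repeated spaces


# Hypothetical tables: the same symbols at level "most" and at level "some".
AT_MOST = {"+": ("plus", "most"), "=": ("equals", "most")}
AT_SOME = {"+": ("plus", "some"), "=": ("equals", "some")}

print(speak("1+1=2", "some", AT_MOST))  # 1 1 2
print(speak("1+1=2", "some", AT_SOME))  # 1 plus 1 equals 2
```

With the symbols at "most", a user at level "some" hears only the digits, which is the behaviour reported in this ticket. With the symbols at "some", the same user hears the full expression.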
@nvaccessAuto

Comment 8 by mdcurran on 2015-05-07 02:42
And other unicode math symbols were added in #3805

@nvaccessAuto

Comment 9 by leonarddr (in reply to comment 7) on 2015-05-08 12:48
Replying to mdcurran:

Also () and – are now preserved to ensure that synths will pause/inflect appropriately if they support these characters.

Ugh, am I the only one who really has to get used to this change? It has advantages in text reading, but certainly disadvantages when moving through GUIs.
I'd personally revert this change manually, but this requires setting preserve to never in the symbols file in appdata. I'd suggest reverting this until we have the possibility to change the preserve value in the NVDA GUI.

@nvaccessAuto

Comment 11 by jteh (in reply to comment 9) on 2015-07-20 06:20
Replying to leonarddr:

I'd personally revert this change manually, but this requires setting preserve to never in the symbols file in appdata. I'd suggest reverting this until we have the possibility to change the preserve value in the NVDA gui.

Filed #5234 to get this into the GUI.
Changes:
Milestone changed from None to 2015.4

@nvaccessAuto

Comment 12 by James Teh <jamie@... on 2015-10-19 03:53
In commit f366923:
English symbols: move several more symbols from most to some including #<>±×÷ and always preserve ()– to ensure synths pause/inflect properly for these characters if supported.

Fixes #3799.
Changes:
Removed labels: incubating
State: closed

@nvaccessAuto nvaccessAuto added this to the 2015.4 milestone Nov 10, 2015
@jcsteh jcsteh added a commit that referenced this issue Nov 23, 2015
@jcsteh jcsteh English symbols: move several more symbols from most to some including #<>±×÷ and always preserve ()– to ensure synths pause/inflect properly for these characters if supported.

Fixes #3799.
f366923