pdf: NVDA skips private use areas characters #5562

surfer0627 · 2015-12-01T06:55:57Z

NVDA skips some symbols whose code points are e18c, e18d, e18e, and e18f.
This occurs in NVDA 2014.1 and later versions.
Please use the attachments to test it.
symbols.docx
symbols.pdf

case1:
environment:
NVDA2015.4 installed
interface language: English
Synthesizer: eSpeak
Adobe Reader: XI

STR:

Open file "symbols.docx"
Press down arrow, NVDA reports "symbol2 b"
Open file "symbols.pdf"
Press down arrow, NVDA reports "symbol2"
Press right arrow several times to move to "colon:"
Press right arrow twice, NVDA reports space
(NVDA skips the symbol e18d.)

notes: In NVDA2012.2, the symbols could be detected.

case2:
environment:
NVDA2012.2 portable
interface language: English
Synthesizer: eSpeak
Adobe Reader: XI

Open file "symbols.pdf"
Press right arrow several times to move to "colon:"
Press right arrow twice, NVDA reports nothing.
(This is because there is a symbol e18c here.)

notes:
• I could not find NVDA2013.x, so I have not tested it.
• NVDA_2012.2 (portable) can be downloaded at
https://dl.dropboxusercontent.com/u/90288447/nvda2012.2.1.rar

surfer0627 · 2015-12-01T07:10:56Z

Sorry, I don't know how to attach files.
I selected them and submitted, but I cannot see any file link on this page.

surfer0627 · 2015-12-07T01:52:48Z

According to investigation by users, NVDA currently skips "private use area" characters when reading PDF files.

notes:
In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium.

definition resources:
https://en.wikipedia.org/wiki/Private_Use_Areas
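
To make the definition concrete, here is a minimal Python sketch that checks whether a code point falls inside one of the three Private Use Area ranges defined by Unicode. The names `PUA_RANGES` and `is_private_use` are illustrative only and are not taken from NVDA's source.

```python
# Unicode Private Use Area ranges (inclusive).
PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the Basic Multilingual Plane
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch is a private use code point."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


# The four symbols reported in this issue.
reported = [chr(cp) for cp in (0xE18C, 0xE18D, 0xE18E, 0xE18F)]
print(all(is_private_use(ch) for ch in reported))
```

All four reported symbols fall in the first range, so they have no standard names or meanings.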

jcsteh · 2015-12-07T02:15:52Z

Because this is the private use area, it is by definition impossible to have standard mappings for them. Therefore, there's nothing useful we can do here.

surfer0627 · 2015-12-07T04:03:13Z

Is it possible for NVDA to detect the character like NVDA2012.2 does?

If users know that there is a private use area character, they could find a sighted person to help.

jcsteh · 2015-12-07T04:33:45Z

Sorry. I misunderstood what you were asking for. I thought you were expecting these symbols to have proper names, which is impossible. However, the fact that they aren't present at all recently is another issue entirely.

jcsteh · 2015-12-07T04:54:27Z

I can confirm this. Change was introduced by 788cefb (#2963, NVDA 2014.1).

@michaelDCurran: This commit does two things:

1) VBufBase's nodeHasUsefulContent: rather than calling isWhitespace, write out a for loop directly, and return true if any character that is not whitespace (iswspace) or is from the private use range or 0-width space (isPrivatecharacter) is found.

2) VbufStorage_buffer_t::addTextFieldNode: strip private characters from the start and end of the text string if they exist when giving the text to the new text node.

2) is the issue here, as it filters out text nodes which only contain a private Unicode character (which is unfortunately how things tend to get rendered in PDF). The question is: why do we need 2)? 1) should cause browsers to fall back to the label as required because nodeHasUsefulContent will return false. Were you just trying to get rid of pointless nodes or can you remember whether there was some other reason for this?
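
The two changes can be modelled roughly as follows. This is a Python sketch of the behaviour described above, not the actual C++ code in NVDA's virtual buffer library; the function names are adapted for illustration, and the exact set of characters treated as private (here, the BMP private use range plus the zero-width space U+200B) is an assumption.

```python
def is_private_character(ch: str) -> bool:
    # Assumption: BMP private use range plus zero-width space (U+200B).
    return 0xE000 <= ord(ch) <= 0xF8FF or ch == "\u200b"


def node_has_useful_content(text: str) -> bool:
    # Change 1: text is useful only if some character is neither
    # whitespace nor private.
    return any(not ch.isspace() and not is_private_character(ch) for ch in text)


def strip_private(text: str) -> str:
    # Change 2: drop private characters from the start and end of the text.
    start, end = 0, len(text)
    while start < end and is_private_character(text[start]):
        start += 1
    while end > start and is_private_character(text[end - 1]):
        end -= 1
    return text[start:end]


pdf_node = "\ue18d"  # a text node holding only a private use character
print(node_has_useful_content(pdf_node))      # False
print(repr(strip_private(pdf_node)))          # ''
print(repr(strip_private("\ue18dOK\ue18d")))  # 'OK'
```

In this model, a PDF text node consisting solely of U+E18D ends up with empty text after the second step, which matches the symptom reported here: the character disappears instead of being presented as an unnamed symbol.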

michaelDCurran · 2015-12-07T05:38:32Z

I guess it was that when we fall back to a label, it is appended, rather than replacing the content. For example, a button with a private use char would then get rendered as the private use char + the label. At the time that looked funny.

However, if it breaks something, then there is no technical reason I can think of why it needs to be removed.
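
The trade-off described in this comment can be illustrated with a small Python sketch. `render_button` and its `strip_private` flag are hypothetical and only model the append behaviour mentioned above; for brevity the sketch removes every private use character, whereas the commit only strips them from the start and end of the text.

```python
def is_private(ch: str) -> bool:
    # Assumption: only the BMP private use range is considered here.
    return 0xE000 <= ord(ch) <= 0xF8FF


def render_button(text: str, label: str, strip_private: bool) -> str:
    """Model of label fallback: the label is appended, not substituted."""
    if strip_private:
        text = "".join(ch for ch in text if not is_private(ch))
    useful = any(not ch.isspace() and not is_private(ch) for ch in text)
    return text if useful else text + label


icon_button = "\ue18d"  # a button whose only content is a private use glyph
without_strip = render_button(icon_button, "Search", strip_private=False)
with_strip = render_button(icon_button, "Search", strip_private=True)
print(len(without_strip), len(with_strip))  # 7 6
```

Without stripping, the rendered text carries the private use character in front of the label; with stripping, only the label remains. The same stripping is what removes the character from unlabelled PDF text.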

bhavyashah · 2017-08-05T15:50:03Z

@jcsteh #5562 (comment) suggests that you are able to successfully reproduce the reported issue and are aware of the causative factors of this regression. Could you and @surfer0627 please check if this bug still stands in the latest version of Acrobat Reader?

surfer0627 · 2017-09-01T03:36:25Z

@bhavyashah:
I could still reproduce this in Adobe Acrobat Reader DC 17.012.20093 - Chinese Traditional.

Adriani90 · 2019-02-18T22:55:41Z

I can still reproduce this issue in NVDA alpha-16768,a6f7fb40 with Adobe Reader 19.010.20091.

surfer0627 · 2020-02-17T07:18:13Z

NVDA 2019.3.1 has now been released.

We still need to use version 2013.3 to read private use area characters in PDF files with Acrobat Reader.

Is it possible to have a try build to fix this issue temporarily?

I do not actually know how to do this:

(git revert 788cefb) or something else.

Thank you for all of your help.

feerrenrut · 2020-03-04T14:01:24Z

This is more complicated than just reverting the change. It's hard to say whether this is a regression or not, since this was initially changed for #2963. I found the description of this issue hard to follow, so I'll attempt to describe it in my own words:

While reading a PDF and encountering a "private use character" without a label, something is reported so that the user can detect the character's presence and ask for help.

However, it seems that if we fixed this in the way suggested by jcsteh's comment, we would end up with noise being added in cases where a label exists. Ideally there is a label that should replace these characters, and they don't have to be rendered.

Adriani90 · 2023-04-01T21:44:57Z

Suggestion: use the speech refactor feature to add a beep or a short sound that indicates a PUA symbol. However, this should apply only in the PDF virtual document. In MS Word, PUA bullets, for example, are mapped to Unicode, so NVDA would no longer report these if we changed the behavior for Microsoft Word as well.
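
As a rough illustration of this kind of indicator, the Python sketch below replaces each private use character with a spoken placeholder instead of a beep, and only when the text comes from a PDF document. The function `announce_private_use`, its `in_pdf` parameter, and the placeholder wording are all hypothetical; none of this is an existing NVDA API.

```python
def announce_private_use(text: str, in_pdf: bool) -> str:
    """Replace private use characters with a placeholder, for PDF text only."""
    if not in_pdf:
        # Leave other documents (for example MS Word) untouched.
        return text
    parts = []
    for ch in text:
        cp = ord(ch)
        if 0xE000 <= cp <= 0xF8FF:  # assumption: BMP private use range only
            parts.append(f" [private use character {cp:04X}] ")
        else:
            parts.append(ch)
    return "".join(parts)


print(announce_private_use("colon:\ue18d", in_pdf=True))
```

A placeholder like this does not name the symbol, which is impossible for private use characters, but it would let the user know that something is there and ask for help.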

jcsteh added the close/wontfix label Dec 7, 2015

jcsteh closed this as completed Dec 7, 2015

jcsteh removed the close/wontfix label Dec 7, 2015

jcsteh reopened this Dec 7, 2015

surfer0627 changed the title ~~pdf: NVDA skips some symbols, the numeric values are (e18c through e18f)~~ pdf: NVDA skips private use areas characters Jun 3, 2016
