pdf: NVDA skips private use areas characters #5562

Open
surfer0627 opened this issue Dec 1, 2015 · 15 comments

Comments

@surfer0627
Contributor

surfer0627 commented Dec 1, 2015

NVDA skips some symbols. Their code points are e18c, e18d, e18e, and e18f.
This occurs in NVDA 2014.1 and later versions.
Please use the attachments to test it.
symbols.docx
symbols.pdf

case1:
environment:
NVDA2015.4 installed
interface language: English
Synthesizer: eSpeak
Adobe Reader: XI

STR:

  1. Open file "symbols.docx"
  2. Press down arrow, NVDA reports "symbol2 b"
  3. Open file "symbols.pdf"
  4. Press down arrow, NVDA reports "symbol2"
  5. Press right arrow several times to move to "colon:"
  6. Press right arrow twice, NVDA reports space
    (NVDA skips a symbol e18d.)

notes: In NVDA 2012.2, the symbols are detected.

case2:
environment:
NVDA2012.2 portable
interface language: English
Synthesizer: eSpeak
Adobe Reader: XI

  1. Open file "symbols.pdf"
  2. Press right arrow several times to move to "colon:"
  3. Press right arrow twice, NVDA reports nothing.
    (This is because there is a symbol e18c here.)

notes:
• I could not find NVDA 2013.x, so I have not tested it.
• NVDA_2012.2 (portable) could be downloaded at
https://dl.dropboxusercontent.com/u/90288447/nvda2012.2.1.rar

@surfer0627
Contributor Author

Sorry, I don't know how to attach files.
I selected them and submitted, but I cannot see any file links on this page.

@surfer0627
Contributor Author

According to users' investigation, NVDA currently skips "private use area" characters when reading PDF files.

notes:
In Unicode, a Private Use Area (PUA) is a range of code points that, by definition, will not be assigned characters by the Unicode Consortium.

definition resources:
https://en.wikipedia.org/wiki/Private_Use_Areas
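
To make the definition concrete, here is a minimal Python sketch that checks whether the code points from this report fall in a Private Use Area. The helper name `is_private_use` is hypothetical and is not part of NVDA. The three ranges are the ones Unicode reserves for private use, and `unicodedata.category` gives these code points the general category `Co`.

```python
import unicodedata

# The three Private Use Areas defined by Unicode (inclusive ranges).
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def is_private_use(ch: str) -> bool:
    """Return True if the single character ch lies in a Private Use Area."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)


# The code points reported in this issue.
for cp in (0xE18C, 0xE18D, 0xE18E, 0xE18F):
    ch = chr(cp)
    print(f"U+{cp:04X}", is_private_use(ch), unicodedata.category(ch))
```

All four reported code points lie in the first range, so no standard name or meaning can be looked up for them.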

@jcsteh
Contributor

jcsteh commented Dec 7, 2015

Because this is the private use area, it is by definition impossible to have standard mappings for them. Therefore, there's nothing useful we can do here.

@jcsteh jcsteh closed this as completed Dec 7, 2015
@surfer0627
Contributor Author

Is it possible for NVDA to detect the character, as NVDA 2012.2 does?

If users know that a private use area character is present, they can ask a sighted person for help.

@jcsteh
Contributor

jcsteh commented Dec 7, 2015

Sorry. I misunderstood what you were asking for. I thought you were expecting these symbols to have proper names, which is impossible. However, the fact that they aren't present at all recently is another issue entirely.

@jcsteh jcsteh reopened this Dec 7, 2015
@jcsteh
Contributor

jcsteh commented Dec 7, 2015

I can confirm this. Change was introduced by 788cefb (#2963, NVDA 2014.1).

@michaelDCurran: This commit does two things:

  1. VBufBase's nodeHasUsefulContent: rather than calling isWhitespace, it writes out a for loop directly and returns true if it finds any character that is neither whitespace (iswspace) nor a private use range or 0-width space character (isPrivatecharacter).
  2. VbufStorage_buffer_t::addTextFieldNode: strips private characters from the start and end of the text string, if they exist, when giving the text to the new text node.

2) is the issue here, as it filters out text nodes which contain only a private Unicode character (which is unfortunately how things tend to get rendered in PDF). The question is: why do we need 2)? 1) should cause browsers to fall back to the label as required, because nodeHasUsefulContent will return false. Were you just trying to get rid of pointless nodes, or can you remember whether there was some other reason for this?
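
The two changes described above can be sketched roughly as follows. This is a Python paraphrase for illustration only, not the actual C++ code from the commit: the function names are hypothetical, and the private use check is assumed to cover U+E000 to U+F8FF plus the zero-width space U+200B.

```python
ZERO_WIDTH_SPACE = "\u200b"


def is_private_character(ch: str) -> bool:
    # Assumption: only the BMP private use range and U+200B are treated as "private".
    return "\ue000" <= ch <= "\uf8ff" or ch == ZERO_WIDTH_SPACE


def node_has_useful_content(text: str) -> bool:
    # Change 1: useful only if some character is neither whitespace nor private.
    return any(not ch.isspace() and not is_private_character(ch) for ch in text)


def strip_private_characters(text: str) -> str:
    # Change 2: drop private characters from both ends before creating the text node.
    start, end = 0, len(text)
    while start < end and is_private_character(text[start]):
        start += 1
    while end > start and is_private_character(text[end - 1]):
        end -= 1
    return text[start:end]


# A PDF text run holding only a private use symbol ends up empty,
# so no text node would be created for it.
print(repr(strip_private_characters("\ue18d")))
print(node_has_useful_content("\ue18d"))
```

Under these assumptions, a run that consists solely of a private use character is reduced to an empty string by the second change, which matches the symptom reported for the PDF.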

@michaelDCurran
Member

michaelDCurran commented Dec 7, 2015

I guess it was that when we fall back to a label, it is appended rather than replacing the content. For example, a button with a private use char would then get rendered as the private use char + the label. At the time that looked funny.

However, if it breaks something, then there is no technical reason I can think of why it needs to be removed.

@surfer0627 surfer0627 changed the title pdf: NVDA skips some symbols, the numeric values are (e18c through e18f) pdf: NVDA skips private use areas characters Jun 3, 2016
@bhavyashah

@jcsteh #5562 (comment) suggests that you are able to successfully reproduce the reported issue and are aware of the causative factors of this regression. Could you and @surfer0627 please check if this bug still stands in the latest version of Acrobat Reader?

@surfer0627
Contributor Author

@bhavyashah:
I could still reproduce this in Adobe Acrobat Reader DC 17.012.20093 - Chinese Traditional.

@Adriani90
Collaborator

I can still reproduce this issue in NVDA alpha-16768,a6f7fb40 with Adobe Reader 19.010.20091.

@surfer0627
Contributor Author

NVDA 2019.3.1 has now been released.

We still need to use version 2013.3 to read private use area characters in PDF files with Acrobat Reader.

Is it possible to have a try build that fixes this issue temporarily?

I do not know how to do this myself: (git revert 788cefb) or something else.

Thank you for all of your help.

@feerrenrut
Contributor

This is more complicated than just reverting the change. It's hard to say whether this is a regression or not, since this was initially changed for #2963. I found the description of this issue hard to follow, so I'll attempt to describe it in my own words:

While reading a PDF and encountering a "private use character" without a label, something should be reported so that the user can detect the character's presence and ask for help.

However, it seems that if we fixed this in the way suggested by jcsteh's comment, we would end up adding noise in cases where a label exists. Ideally there is a label that should replace these characters, and they don't have to be rendered.

@Adriani90
Collaborator

Suggestion: use the speech refactor feature to add a beep or a short sound that indicates a PUA symbol. However, this should apply only in the PDF virtual document. In MS Word, PUA bullets, for example, are mapped to Unicode, so NVDA would no longer report them if we changed the behavior for Microsoft Word as well.
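
As a rough illustration of this suggestion, the Python sketch below marks private use characters only when a flag says the text comes from a PDF document. Everything here is hypothetical: `announce_private_use` and `in_pdf_document` are not NVDA APIs, and a textual placeholder stands in for the proposed beep or sound.

```python
import unicodedata


def announce_private_use(text: str, in_pdf_document: bool) -> str:
    """Replace each private use character with a spoken placeholder, PDF only."""
    if not in_pdf_document:
        # Other documents (for example Word) are left untouched.
        return text
    parts = []
    for ch in text:
        if unicodedata.category(ch) == "Co":  # "Co" is the private use category
            parts.append(f" [private use U+{ord(ch):04X}] ")
        else:
            parts.append(ch)
    return "".join(parts)


print(announce_private_use("colon:\ue18d", in_pdf_document=True))
print(announce_private_use("colon:\ue18d", in_pdf_document=False) == "colon:\ue18d")
```

Restricting the marker to PDF content keeps the behavior for other applications unchanged, which is the constraint raised in the suggestion.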
