NVDA not line breaking text in PDFs #7275

Open
Neurrone opened this issue Jun 12, 2017 · 17 comments
Labels
p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority

Comments

@Neurrone

When browsing PDFs in Adobe Reader, NVDA does not preserve line breaks (this is especially noticeable in source code), whereas JAWS breaks the lines appropriately.

@derekriemer
Collaborator

derekriemer commented Jun 12, 2017 via email

@feerrenrut
Contributor

@Neurrone In particular it would be handy if you could provide a PDF which demonstrates this issue. Thanks for bringing this to our attention!

@jcsteh
Contributor

jcsteh commented Jun 13, 2017

PDF has semantic tags for paragraphs, lists, tables and the like. However, it does not differentiate author-inserted line breaks (as in source code or poetry, sometimes known as hard line breaks) from line breaks used to wrap text which cannot fit on a single line (sometimes known as soft line breaks). Because NVDA splits text into lines itself (according to the "Maximum number of characters on one line" Browse Mode setting), we strip line break characters. Otherwise, you end up with a lot of long lines followed by short lines (as I recall happened in JAWS when I used it years ago). Having spoken to someone involved in writing the PDF accessibility specification, my understanding is that the correct way to author such content is to tag each line as a separate list item or paragraph. Unfortunately, it seems no one actually does this in the wild.

I think the only way we could reasonably solve this is to ignore NVDA's own settings for splitting lines and instead use only the line breaks in the PDF. That would also require us to not treat line breaks as paragraphs for PDF. This would be somewhat inconsistent with browse mode everywhere else, but I think consistency is probably outweighed by usability here.
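
To make the trade-off described above concrete, here is a minimal sketch in Python. It is not NVDA's actual code: the function names, the sample strings and the use of the standard `textwrap` module are assumptions made purely for illustration, and `max_chars` only stands in for the "Maximum number of characters on one line" setting.

```python
import textwrap


def strip_and_rewrap(pdf_text: str, max_chars: int = 100) -> list[str]:
    """Drop the PDF's own line breaks, then split at max_chars.

    Hypothetical stand-in for the behaviour described in the comment:
    hard and soft breaks are both lost.
    """
    flattened = " ".join(pdf_text.split())
    return textwrap.wrap(flattened, width=max_chars)


def keep_breaks_and_rewrap(pdf_text: str, max_chars: int = 100) -> list[str]:
    """Keep the PDF's line breaks AND split at max_chars.

    When PDF lines are slightly longer than max_chars, this yields
    a long line followed by a short remainder, over and over.
    """
    lines: list[str] = []
    for pdf_line in pdf_text.splitlines():
        lines.extend(textwrap.wrap(pdf_line, width=max_chars) or [""])
    return lines


def pdf_breaks_only(pdf_text: str) -> list[str]:
    """Ignore max_chars entirely and use only the PDF's line breaks."""
    return pdf_text.splitlines()


# Hypothetical source code with hard line breaks.
code = "x = 2 + 2\nprint(x)\ny = 10 ** 2"
# Hypothetical prose that the PDF soft-wrapped at 14 characters.
prose = "aaaa bbbb cccc\ndddd eeee ffff"

print(strip_and_rewrap(code))                      # code collapses onto one line
print(pdf_breaks_only(code))                       # code keeps its three lines
print(keep_breaks_and_rewrap(prose, max_chars=10))  # long/short alternation
print(strip_and_rewrap(prose, max_chars=10))       # evenly re-wrapped prose
```

Under these assumptions, only `pdf_breaks_only` keeps source code readable, at the cost of no longer honouring the user's line-length setting for PDF content.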

@jcsteh jcsteh added the p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority label Jun 13, 2017
@Neurrone
Author

@feerrenrut The PDFs I've been dealing with are copyrighted, so I wasn't sure how I could legally distribute a sample. I wanted to raise this here first to see if anyone knew what was happening before I cut a page out of a PDF or something.

@feerrenrut
Contributor

That's ok, thanks @Neurrone. If you happen to come across one that you can safely distribute, please do attach it. This will make life easier for the person who works on this, since they can get started straight away without first having to find a test case.

@Neurrone
Author

Neurrone commented Jun 16, 2017

@derekriemer @feerrenrut I've found a free PDF I can link to that demonstrates this.

This is a sample from the book Haskell Programming from First Principles.

On page 3 (press Ctrl+Shift+N and type 3 to go directly to it), this is the output in NVDA for the source code near the middle of the page:

Now try entering some simple arithmetic at your prompt: Prelude> 2 + 2 4Prelude> 7 < 9 True 
Prelude> 10 ^ 2 100

But the output in JAWS is:

Now try entering some simple arithmetic at your prompt: 
Prelude> 2 + 2 
4
Prelude> 7 < 9 
True 
Prelude> 10 ^ 2 
100

The latter is obviously much easier to read, especially when the code gets more complex. The same problem occurs in many other places, e.g. section titles.

@Neurrone
Author

Side note: how are the priority labels chosen? I feel that this should be at least medium severity, given how common PDFs are, and fixing it would be a big usability win.

@jcsteh
Contributor

jcsteh commented Jun 16, 2017 via email

@Neurrone
Author

@jcsteh Noted. There's probably no chance that the tools being used in the wild will get fixed, though (PdfLaTeX is one of the worst offenders).

@Brian1Gaff

Brian1Gaff commented Jun 17, 2017 via email

@Neurrone
Author

@Brian1Gaff It would be nice if the tools being used to generate PDFs got updated so they are standards compliant, but I'm not holding my breath. It might happen for Word, for example, but I don't think it will for PdfLaTeX.

Even if they got fixed, we still have to deal with existing PDFs, so we'd still have to support them.

@dan1982code

I have a simple example attached here. It would be great if this could be fixed.
test.pdf

@Cleversn

Cleversn commented May 8, 2019

RD-800_PA.pdf
Here's another problematic PDF. One curious thing is that this attached file used to display correctly some time ago, including table recognition. Now NVDA neither breaks lines nor recognizes tables in it. I cannot say precisely when it started to display badly, but it was probably during 2018. I'm using the latest NVDA 2019.1.1, Adobe Acrobat Reader DC 2019.010.20099, Windows 10 Pro 1809.

@Adriani90
Collaborator

cc: @LeonarddeR

@DrSooom

DrSooom commented Oct 3, 2019

Could somebody check if this issue is still valid with Adobe Reader DC 19.012.20040 and NVDA 2019.2.1?

@cary-rowen
Contributor

According to my testing, Foxit Reader does not have this problem. It is still an important issue, though, and it would be great if it could be solved.

@cary-rowen
Contributor

There have been multiple duplicate issues, so I believe people are very concerned about this.
@feerrenrut @LeonarddeR

10 participants