NVDA not line breaking text in PDFs #7275

Neurrone · 2017-06-12T15:01:07Z

When browsing PDFs in Adobe Reader, NVDA doesn't break text across lines (this is especially noticeable in source code), while JAWS breaks it appropriately.

derekriemer · 2017-06-12T19:12:21Z

Can you please provide steps to reproduce this problem?


feerrenrut · 2017-06-13T05:41:46Z

@Neurrone In particular it would be handy if you could provide a PDF which demonstrates this issue. Thanks for bringing this to our attention!

jcsteh · 2017-06-13T06:04:15Z

PDF has semantic tags for paragraphs, lists, tables and the like. However, it does not differentiate author inserted line breaks (as in source code or poetry, sometimes known as hard line breaks) from line breaks used to wrap text which cannot fit on a single line (sometimes known as soft line breaks). Because NVDA splits text into lines itself (according to the "Maximum number of characters on one line" Browse Mode setting), we strip line break characters, as otherwise, you end up with a lot of long lines followed by short lines (as I recall happened in JAWS when I used it years ago). Having spoken to someone involved in PDF accessibility specification writing, my understanding is that the correct way to author such content is to tag each line as a separate list item or paragraph. Unfortunately, it seems no one actually does this in the wild.

I think the only way we could reasonably solve this is to ignore NVDA's own settings for splitting lines and instead use only the line breaks in the PDF. That would also require us to not treat line breaks as paragraphs for PDF. This would be somewhat inconsistent with browse mode everywhere else, but I think consistency is probably outweighed by usability here.
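To make the trade-off concrete, here is a minimal Python sketch of the two line-splitting strategies described above. It is an illustration only, not NVDA's virtual buffer code: the function names are hypothetical, and the width of 100 is just an assumed stand-in for the "Maximum number of characters on one line" setting.

```python
import textwrap
from typing import List


def wrap_ignoring_breaks(text: str, max_chars: int = 100) -> List[str]:
    """Hypothetical model of the current behaviour: drop every line break,
    then re-split the text at max_chars (assumed value, for illustration)."""
    flattened = " ".join(text.split())  # line breaks become plain spaces
    return textwrap.wrap(flattened, width=max_chars)


def split_on_source_breaks(text: str) -> List[str]:
    """Hypothetical model of the proposed behaviour: ignore max_chars and
    keep only the line breaks that are present in the document."""
    return [line.strip() for line in text.splitlines() if line.strip()]


# A short code listing with hard line breaks, as an author would write it.
sample = "Prelude> 2 + 2\n4\nPrelude> 7 < 9\nTrue\nPrelude> 10 ^ 2\n100"

for line in wrap_ignoring_breaks(sample):
    print(line)
print("---")
for line in split_on_source_breaks(sample):
    print(line)
```

With the first function the whole listing collapses into a single line, because it is shorter than the wrap width; with the second, each prompt and each result stays on its own line. The cost of the second approach is the one noted above: soft-wrapped prose would be split wherever the PDF happened to wrap it.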

Neurrone · 2017-06-13T06:13:43Z

@feerrenrut The PDFs I've been dealing with are copyrighted, so I wasn't sure how I could legally distribute a sample. I wanted to raise it here first to see if anyone knew what was happening before I cut a page out of a PDF or something.

feerrenrut · 2017-06-13T07:02:52Z

That's ok, thanks @Neurrone. If you happen to come across one that you can safely distribute, please do attach it. This will make life easier for the person who works on this, since they can get started straight away without first having to find a test case.

Neurrone · 2017-06-16T14:23:16Z

@derekriemer @feerrenrut I've found a free PDF I can link to that demonstrates the problem.

This is a sample from the book Haskell Programming from First Principles.

On page 3 (press ctrl+shift+n and type 3 to directly go to it), this is the output in NVDA for the source code near the middle of the page.

Now try entering some simple arithmetic at your prompt: Prelude> 2 + 2 4Prelude> 7 < 9 True 
Prelude> 10 ^ 2 100

But the output in JAWS is

Now try entering some simple arithmetic at your prompt: 
Prelude> 2 + 2 
4
Prelude> 7 < 9 
True 
Prelude> 10 ^ 2 
100

The latter is obviously much easier to read, especially when code gets more complex. This also happens in many other places, e.g., section titles.

Neurrone · 2017-06-16T14:25:33Z

Side note: how are the priority labels chosen? I feel that this should be medium severity at least, given how common PDFs are, and fixing it would be a big usability win.

jcsteh · 2017-06-16T21:12:05Z

Working around this requires a pretty significant refactor of the way lines work in virtual buffers. Also, while I accept this happens in the wild, this is technically incorrect authoring according to PDF accessibility standards.

Neurrone · 2017-06-17T04:31:26Z

@jcsteh noted. There's probably no chance that tools being used in the wild will get fixed though (PdfLaTeX is one of the worst offenders).

Brian1Gaff · 2017-06-17T10:02:54Z

I think here we have the nub of many of the problems we see, particularly in PDFs. That is, when talking to those making these files, you are in fact talking to people who know nothing about accessibility or tagging or any of that cool stuff we rely on. This is why, in my view, whatever we do to try to fix things at the screen reader level, we should instead be trying to get the authoring packages modified so people cannot ignore the accessibility parts of the job. Quite who you lobby for this I have no idea.

Neurrone · 2017-06-17T16:23:10Z

@Brian1Gaff It would be nice if the tools used to generate PDFs got updated so they are standards compliant, but I'm not holding my breath. It might happen for, e.g., Word, but I don't think it will for PdfLaTeX.

Even if they got fixed, we still have to deal with existing PDFs, so we'd still have to support them.

dan1982code · 2017-06-28T23:40:10Z

I have a simple example attached here. It would be great if this could be fixed.
test.pdf

Cleversn · 2019-05-08T13:21:58Z

RD-800_PA.pdf
Here's another problematic PDF. One curious thing is that this attached file used to display correctly some time ago, including table recognition. Now NVDA neither breaks lines nor recognizes tables in it. I cannot say precisely when it started to display badly, but it was probably sometime in 2018. I'm using the latest NVDA 2019.1.1, Adobe Acrobat Reader DC 2019.010.20099, Windows 10 Pro 1809.

Adriani90 · 2019-05-08T17:27:30Z

cc: @LeonarddeR

DrSooom · 2019-10-03T17:24:42Z

Could somebody check if this issue is still valid with Adobe Reader DC 19.012.20040 and NVDA 2019.2.1?

cary-rowen · 2021-03-31T03:52:27Z

According to my test, Foxit Reader does not have this problem, but this problem is still important, and it would be great if it can be solved.

cary-rowen · 2021-03-31T03:55:10Z

There have been multiple duplicate issues, so I believe people are very concerned about this problem.
@feerrenrut @LeonarddeR

jcsteh added the p4 https://github.com/nvaccess/nvda/blob/master/projectDocs/issues/triage.md#priority label Jun 13, 2017

jcsteh mentioned this issue Jul 19, 2017

NVDA ignoring line breaks in PDF documents with left-to-right, top-to-bottom reading order #1216

Closed

surfer0627 mentioned this issue May 2, 2019

Adobe Acrobat DC: When copying / reading text in browse mode, separate lines are merged together #9370

Closed

DrSooom mentioned this issue Jul 6, 2019

nvda displays the contents of a .pdf file in large lines #9882

Closed
