Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Regression] Text-selection boxes visually overlap after PR 17533 #17561

Open
Snuffleupagus opened this issue Jan 22, 2024 · 16 comments
Open

[Regression] Text-selection boxes visually overlap after PR 17533 #17561

Snuffleupagus opened this issue Jan 22, 2024 · 16 comments

Comments

@Snuffleupagus
Copy link
Collaborator

Attach (recommended) or Link to PDF file here: https://github.com/mozilla/pdf.js/blob/master/test/pdfs/pdf.pdf.link

(This can be reproduce even with tracemonkey.pdf, but it's less pronounced since the lines don't overlap as much.)

Configuration:

Steps to reproduce the problem:

  1. Select text, and look at the intersection between different text-selection boxes.

What is the expected behavior? (add screenshot)
The intersection should not be visible, as previously:

old

What went wrong? (add screenshot)
The intersection is visible:

new

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension): N/A

/cc @calixteman

@marco-c
Copy link
Contributor

marco-c commented Jan 22, 2024

This has not made its way to Firefox yet, right?

@Snuffleupagus
Copy link
Collaborator Author

This has not made its way to Firefox yet, right?

Correct.

@Snuffleupagus Snuffleupagus added the release-blocker Blocker for the upcoming release label Jan 22, 2024
@calixteman
Copy link
Contributor

I think this regression is acceptable with regard to the improvement we've in HCM and it helps to see the caret to make highlighting possible with the keyboard.
It's a visual annoyance, but I'd say that it doesn't avoid to use text selection and I checked few pdfs and in general there are no issues, so maybe this case is just a corner case.
That said, as far as I know there is no way to fix it without reverting the regressor patch BUT the height of the selection depends of the font, its size, the line height, ... which aren't correct because the font we use (serif/sans serif) is very likely not the one we've in the pdf. So the correct fix should be to use the font from the pdf in the text layer (I still have a WIP locally to do that).

@timvandermeij
Copy link
Contributor

timvandermeij commented Jan 27, 2024

Given the above, should we still mark this as a release blocker? I'm planning to make a new release tomorrow, and if this does block a new release I'll postpone that action.

Edit: Given the other release blocker ticket, I'll postpone the release with a week anyway and then see where we are.

@Snuffleupagus
Copy link
Collaborator Author

I think this regression is acceptable with regard to the improvement we've in HCM and it helps to see the caret to make highlighting possible with the keyboard.

I don't really agree with that assessment, since we have potential regressions for the majority of users this way.

It's a visual annoyance, but I'd say that it doesn't avoid to use text selection and I checked few pdfs and in general there are no issues, so maybe this case is just a corner case.

Unfortunately it can be a lot worse than just a minor annoyance in some cases, please see an example further below.

That said, as far as I know there is no way to fix it without reverting the regressor patch

Would it be feasible to only apply the new styles in HCM, to avoid affecting "normal" browsing?

So the correct fix should be to use the font from the pdf in the text layer (I still have a WIP locally to do that).

The problem is that it still won't fix these types of issues generally, since glyphs are often positioned absolutely and that means that we'll never get rid of all overlapping text-selection boxes. The example that I used in this issue contained vertical overlap, but it could just as well be horizontal overlap which can look a lot worse.

The following is an especially problematical example, since it uses Type3 fonts, but it does illustrate (much better) why we "need" to do something about it: https://github.com/mozilla/pdf.js/files/1274923/SMPembedded.2006.10.09b.pdf

overlap

@calixteman
Copy link
Contributor

In general we can't ship something which isn't accessible (it used to be the opposite but nowadays it's the policy), so we need to make highlighting keyboard accessible if we want to ship it.
The "annoyance" you mentioned here is less important for us than fixing a11y issues.
About the pdf you mentioned in your comment: it was already awful (in term of text selection) and I fully agree that now it's worse but between "awful" and "slightly more awful" and I don't see that much difference, so I still consider that it's a minor annoyance in general.
That said, I usually try to improve things instead of making them worse, so I've a WIP locally to draw the selection in using the svg layer we recently added , for example:
image
and it'll fix this bug and it'll let us really improve text selection in HCM (in using a filter) which is, right now, awful. In term of performance, I didn't see any noticeable regressions.

@calixteman
Copy link
Contributor

And for your information, I have a potential fix for the selection issue in Chrome.
I need to think about that globally but I've the feeling it should be possible to fix most of the selection issues we've.

@marco-c
Copy link
Contributor

marco-c commented Feb 2, 2024

I think this regression is acceptable with regard to the improvement we've in HCM and it helps to see the caret to make highlighting possible with the keyboard.

I don't really agree with that assessment, since we have potential regressions for the majority of users this way.

I agree we should avoid shipping this regression to users. I have filed https://bugzilla.mozilla.org/show_bug.cgi?id=1878258 to track this regression so that we make sure it is fixed before we release Firefox 124 (which currently includes the regression).

That said, we are considering the regression less severe than the improvement because while it's true that the regression affects a larger number of users, the improvement has a huge impact for users that currently are completely unable to select text and/or highlight text in the reader.

@timvandermeij
Copy link
Contributor

I agree we should avoid shipping this regression to users. I have filed https://bugzilla.mozilla.org/show_bug.cgi?id=1878258 to track this regression so that we make sure it is fixed before we release Firefox 124 (which currently includes the regression).

Thank you for the feedback. Given this information I'll hold off on releasing a new PDF.js version until this is fixed so there is enough time and we avoid any bug reports about this.

@timvandermeij
Copy link
Contributor

Given that this has been a release blocker for a long time now and the last PDF.js release was in December 2023, the question was raised in #17880 (comment) if we should go ahead and create a release now and just put a note in the release notes about this bug? I agree with this because we have had many features and bugfixes land since that users would like to have, and if we wait longer the release will only get bigger while we would like to keep releases small (as was the purpose of the monthly release cycle).

@calixteman @marco-c What do you think about this?

@marco-c
Copy link
Contributor

marco-c commented Apr 4, 2024

@calixteman if it's a quick and easy fix, I'd say let's wait on that. If it's a more cumbersome fix, I'd agree with shipping a version with the regression included.

@timvandermeij
Copy link
Contributor

@calixteman What do you think of the above? If it's desired I can make a new release today or tomorrow, for the reasons mentioned above and so we can land some other PRs that would be safer to merge once a new release is made.

@calixteman
Copy link
Contributor

Yes please make a release.
I'll try to land a patch in the next weeks.

@timvandermeij
Copy link
Contributor

timvandermeij commented Apr 11, 2024

I have just completed the release of PDF.js version 4.1.392, with this bug referenced as a known issue in the release notes at https://github.com/mozilla/pdf.js/releases/tag/v4.1.392. Landing of new patches should be unblocked now.

@calixteman The bot sadly doesn't seem to have pushed the release to NPM. Could you check the logs and push it manually so NPM is also in sync again (see https://github.com/mozilla/botio-files-pdfjs/blob/master/on_release.js#L21 for how the bots do it)? (This prompted me to look into porting the NPM part to GitHub Actions as part of #11851 again, and I think I now finally have an idea on how we can prepare and test this without actually having to push something to NPM.)

@calixteman
Copy link
Contributor

It was a matter of cached ssh key.
It should fine now:
https://www.npmjs.com/package/pdfjs-dist

@timvandermeij timvandermeij removed the release-blocker Blocker for the upcoming release label May 28, 2024
@timvandermeij
Copy link
Contributor

I have removed the release-blocker label because we're already tracking this with the regression label and as high priority on the project board, plus we've had several releases by now and haven't received new complaints about this as far as I have seen (so hopefully for most documents the change isn't really visible). We should still fix this, but it doesn't seem severe enough anymore to warrant the release-blocker label or separate inclusion in the release notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: High priority
PDF.js quality
High priority
Development

No branches or pull requests

4 participants