Textlayer out of alignment #332

Closed
jimmythompson opened this issue Jan 10, 2019 · 32 comments
Labels
question Further information is requested

Comments

@jimmythompson

jimmythompson commented Jan 10, 2019

Description

The textContent on this particular PDF seems out of alignment with what's rendered on the canvas, meaning copying and pasting is really difficult. This appears to be fine in vanilla PDF.js.

I'll try and look into this myself, but I'm not really sure how the whole thing works. Any help fixing this would be lovely. 😄

Steps to reproduce

  1. Try rendering the following PDF in React PDF: http://moco17.movementcomputing.org/wp-content/uploads/2017/12/poster4-donneaud.pdf
  2. Try highlighting and copying text from the abstract, a good example being "This soft input device - Musical Skin".

Expected behavior

It should highlight and copy the text your cursor is running over.

Additional information

If applicable, add screenshots (preferably with browser console open) and files you have an issue with to help explain your problem.

Environment

  • Browser: 71.0.3578.98
  • React-PDF version: 4.0.2
  • React version: 16.4.2
@matt-erhart

matt-erhart commented Jan 16, 2019

[screenshot]

Same here. I'm trying out the transform from the svg example from pdfjs:

const adjustTransMatForViewport = (viewportTransform, textItemTransform) => {
  // we have to take in account viewport transform, which includes scale,
  // rotation and Y-axis flip, and not forgetting to flip text.
  return pdfjsLib.Util.transform(
    pdfjsLib.Util.transform(viewportTransform, textItemTransform),
    [1, 0, 0, -1, 0, 0]
  );
};

But I can't get the xScale quite right.

Image is from react-pdf example unchanged.
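
For anyone following the transform above, here is a minimal standalone sketch of what the composition does, assuming the usual six-element `[a, b, c, d, e, f]` affine layout. The helper `multiply` is a hypothetical stand-in for `pdfjsLib.Util.transform`, and the viewport and text item matrices are made-up values for illustration only.

```typescript
// Affine matrix in PDF order: [a, b, c, d, e, f], where a point (x, y)
// maps to (a*x + c*y + e, b*x + d*y + f).
type Matrix = [number, number, number, number, number, number];

// Hypothetical stand-in for pdfjsLib.Util.transform: composes m1 with m2,
// so that m2 is applied to a point first and m1 second.
function multiply(m1: Matrix, m2: Matrix): Matrix {
  return [
    m1[0] * m2[0] + m1[2] * m2[1],
    m1[1] * m2[0] + m1[3] * m2[1],
    m1[0] * m2[2] + m1[2] * m2[3],
    m1[1] * m2[2] + m1[3] * m2[3],
    m1[0] * m2[4] + m1[2] * m2[5] + m1[4],
    m1[1] * m2[4] + m1[3] * m2[5] + m1[5],
  ];
}

// Same shape as adjustTransMatForViewport: apply the viewport transform to
// the text item transform, then flip the text back upright.
function adjustForViewport(viewportTransform: Matrix, textItemTransform: Matrix): Matrix {
  return multiply(multiply(viewportTransform, textItemTransform), [1, 0, 0, -1, 0, 0]);
}

// Illustrative values (assumptions): a 100-unit-tall page rendered at scale 2,
// and 12-unit text placed at (10, 20) in PDF space.
const viewport: Matrix = [2, 0, 0, -2, 0, 200];
const textItem: Matrix = [12, 0, 0, 12, 10, 20];

const result = adjustForViewport(viewport, textItem);
// Rendered font height is the length of the matrix's second column.
const fontHeight = Math.hypot(result[2], result[3]);
console.log(result, fontHeight);
```

The last two entries of `result` give the on-screen position of the text origin, and `fontHeight` is the size the text should be drawn at. Any horizontal stretch still has to come from comparing the PDF's text width with the width of the font actually used in the browser.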

@nikonet

nikonet commented Jan 28, 2019

I had the same problem on window resize and scale changes. I solved it by calling a function that removes some of the CSS applied to the text-content layer each time a page loads successfully. It seems to me that the default top, left and transform CSS properties on the textContent elements don't do anything useful. Those styles are inline, so they need to be changed with JS.

function removeTextLayerOffset() {
  // Clear the inline offsets set on every rendered text layer.
  const textLayers = document.querySelectorAll(".react-pdf__Page__textContent");
  textLayers.forEach(layer => {
    const { style } = layer;
    style.top = "0";
    style.left = "0";
    style.transform = "";
  });
}

<Page onLoadSuccess={removeTextLayerOffset}/>

@wojtekmaj
Owner

wojtekmaj commented Jan 28, 2019

@jimmythompson Tested this on my test page and it looks fairly alright. The text layer is indeed slightly misaligned, so I'll have a second look at it when I'm on my desktop, but such extreme misalignment (50% of page width) is likely caused by putting Page in a too-small container or a similar styling issue. In such cases the canvas layer would simply expand (because it's one static element), while the text layer would overflow its 0-pixel-wide container (because it's a set of absolutely positioned elements).

@joedukk

joedukk commented Jan 30, 2019

I saw this issue as well. My temp fix was to apply the following CSS to the react-pdf__Page element -

.react-pdf__Page.pdf-page { display: flex; justify-content: center; }

@peterkmurphy

With some PDFs, the text layer appears as invisible text across other controls on the screen - making it hard for users to interact with it. They see the mouse cursor as a textual "I" cursor even though they're over a radio button; they wouldn't be aware of the reason why unless they tried to select the control, and then see the text highlighted in reverse.

While this bug is being fixed, here is a suggestion for those who just need to view the PDF and don't need to copy and paste. It's what I'll do in the meantime:

.react-pdf__Page__textContent {display: none;}

@wojtekmaj
Owner

@peterkmurphy You can disable the text layer using the renderTextLayer prop (see details in the README). It's a much better way.

After testing for a long time I could reproduce the issue where the lines on the text layer are moved down by 1 line. No other issues were found.

Alignment before:

[image]

Alignment after:

[image]

I'd say it's pretty satisfactory for the means we have. This was fixed with 9dbc54e and will be released with the next release.

@jjlauer

jjlauer commented Mar 7, 2019

@wojtekmaj -- beautiful library. Truly appreciate all the hard work you've put into it!

I have similar text alignment issues (react-pdf v4.0.5) where some of the spans are off by hundreds of pixels relative to the image underneath. To be clear, many PDFs render similarly to your examples above, where they are very close to being aligned correctly. However, there are quite a few that are badly misaligned. Here is a screenshot with the text layer set to red:

[screenshot]

The same file rendered with PDF.js in their online demo has a perfect text layer for the exact same PDF. Just curious if react-pdf is heavily modifying the textLayer created by PDF.js, if some kind of styling I have is causing it, or something else.

I've attached the PDF with the worst misalignment. The "Disney Store" text in the top portion is where it's off by hundreds of pixels.

00694003239620190303.pdf

If you take the same PDF and render it in https://mozilla.github.io/pdf.js/web/viewer.html you'll notice the text layer is pixel perfect.

Taking a peek at the spans under-the-hood, in react-pdf, the "Disney Store" span looks like this:

<span style="height: 1em; font-family: sans-serif; font-size: 13.0392px; position: absolute; top: 154.094px; left: 75.1059px; transform-origin: left bottom 0px; white-space: pre; pointer-events: all;">           Disney Store # 00694</span>

PDF.js demo looks like this:

<span style="left: 76.8px; top: 157.569px; font-size: 13.3333px; font-family: monospace; transform: scaleX(0.996982);">           Disney Store # 00694</span>

Obviously the width of the two examples is different, so I don't expect them to be the same. However, the version in react-pdf sets the height, the font-family is different, and the transforms are quite different.

@jjlauer

jjlauer commented Mar 11, 2019

@wojtekmaj I figured out the underlying cause of the misalignment in my previous example. It looks like pdf.js does font detection and swaps in monospace when it knows a serif-based font isn't being used. It also applies a transform that scales text horizontally based on the actual font in use. I spent some time thinking about building a PR to bring that same code in, but it turned out to be easier to delegate text layer rendering to pdf.js than to replicate what it was doing. After testing a variety of PDFs, text selection via pdf.js seems to work better across a wide range of fonts.

I implemented everything in this fork: https://github.com/greenback-inc/react-pdf

If you are interested in possibly accepting this major change, I'd be happy to submit a PR. We'd lose the ability to customize the text rendered. Alternatively, we could allow users to select a pdf.js text layer as an alternative to the native react-pdf one.
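
The horizontal scaling described above can be illustrated in isolation. This is a simplified sketch of the idea, not pdf.js's actual code: the function name `horizontalScale`, the `MeasureText` callback, and the fixed-width measurer are all assumptions made for the example.

```typescript
// Returns how wide `text` renders in the fallback font, in CSS pixels.
type MeasureText = (text: string) => number;

// Computes the scaleX factor that stretches or squeezes fallback-font text
// so that it spans the width the PDF says the text occupies.
function horizontalScale(text: string, targetWidth: number, measure: MeasureText): number {
  const measured = measure(text);
  // Nothing measurable (empty string, zero width): leave the span unscaled.
  if (!(measured > 0) || !(targetWidth > 0)) return 1;
  return targetWidth / measured;
}

// Hypothetical measurer: a monospace font whose glyphs are all 8px wide.
const monospace8: MeasureText = (text) => text.length * 8;

// Illustrative target width (an assumption, not taken from the attached PDF).
const scale = horizontalScale("Disney Store # 00694", 152, monospace8);
const cssTransform = `scaleX(${scale.toFixed(4)})`;
console.log(cssTransform);
```

In a browser, the measurer could wrap a canvas 2D context, for example `(t) => ctx.measureText(t).width` with `ctx.font` set to the span's font. Without a step like this, a span keeps its natural width in the fallback font, which may be part of why long lines drift the most.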

@katiewoolston

katiewoolston commented Apr 9, 2019

I'm having a similar problem, even with the simple example from the readme - the canvas and text layer are horizontally misaligned by quite a lot. I can fix it by hardcoding the width of the Page component so it's the same as the canvas. It looks like the reason for the misalignment is that the canvas is fixed width, whereas the text layer is centred in the available space. I've tried modifying the text layer or canvas css, but it gets problematic when you try to scale or rotate.

[screenshot]

Note: fixed this issue by applying display: inline-block; to the Page component.

@Stani2980

@nikonet That worked like a charm in my case, aligning every page nearly perfectly!

@Abishek-Sudhakaran

@nikonet superb dude..it worked well and good

@vctormb

vctormb commented May 25, 2019

I've tried all the solutions here and for me it didn't work. No idea how to solve it.

My problem is: I'm rendering a PDF in a small container (a modal) and everything gets messed up.

@wojtekmaj
Owner

@vctormb The most important thing is to ensure that your rendered PDF fits its container. If it does not, the PDF layer will shrink, but the text layer will not.

@vctormb

vctormb commented Jun 21, 2019

@wojtekmaj It fits OK, but when I resize the screen slowly (not too much) the PDF layer shrinks.

@wojtekmaj
Owner

And by default it will do that (but I'm planning to remove it for 5.0 because that causes more confusion than real benefits). You can adjust CSS for it not to resize the canvas/svg, or use a window resize event handler to provide proper dimensions for React-PDF for Text Layer to also be rendered smaller.
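
A rough sketch of the resize-handler approach mentioned above, under a few assumptions: `watchPageWidth` and `ResizeSource` are hypothetical names, the 800px cap is arbitrary, and a fake window object is used so the snippet runs outside a browser.

```typescript
// Minimal shape of the browser `window` that this sketch relies on.
interface ResizeSource {
  innerWidth: number;
  addEventListener(type: "resize", listener: () => void): void;
  removeEventListener(type: "resize", listener: () => void): void;
}

// Reports a page width immediately and again on every resize event.
// Returns a function that removes the listener.
function watchPageWidth(
  source: ResizeSource,
  maxWidth: number,
  onWidth: (width: number) => void,
): () => void {
  const report = () => onWidth(Math.max(1, Math.min(source.innerWidth, maxWidth)));
  source.addEventListener("resize", report);
  report();
  return () => source.removeEventListener("resize", report);
}

// Fake source standing in for `window` (an assumption for this demo).
const listeners = new Set<() => void>();
const fakeWindow: ResizeSource = {
  innerWidth: 1200,
  addEventListener: (_type, listener) => { listeners.add(listener); },
  removeEventListener: (_type, listener) => { listeners.delete(listener); },
};

const widths: number[] = [];
const stop = watchPageWidth(fakeWindow, 800, (w) => { widths.push(w); });

// Simulate the user shrinking the window, then unsubscribe.
fakeWindow.innerWidth = 500;
listeners.forEach((listener) => listener());
stop();
console.log(widths);
```

In a component, the reported width would typically be kept in state and passed to the page via its width prop, so that the canvas and the text layer are both rendered at the same size instead of one of them being shrunk by CSS.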

@alex-mironov

@wojtekmaj can you please help to figure out what causes the issue with the text layer offset? The dimensions of container and page seem to be in line, but the text is shifted and the selection line height seems also quite big.
[screenshot]
I would be thankful for any ideas!

@alex-mironov

alex-mironov commented Oct 18, 2019

[screenshot]

From what I can see, span is positioned correctly, but the text itself (red) gets pushed down.

@juanvalag

Try these styles:
[screenshot of the suggested styles]

@jacksleight

I was having an issue where the text layer was appearing offset vertically because the PDF was being rendered inside a parent that set line-height (to 1.5 in this instance), and I don't think react-pdf resets this.

[screenshot]

The fix was to reset line-height on the document wrapper:

.react-pdf__Document {
    line-height: 1;
}

[screenshot]

@zdrukteinis

zdrukteinis commented Mar 3, 2020

Added removeTextLayerOffset function on page load success, provided by @nikonet :

function removeTextLayerOffset() {
  const textLayers = document.querySelectorAll(".react-pdf__Page__textContent");
  textLayers.forEach(layer => {
    const { style } = layer;
    style.top = "0";
    style.left = "0";
    style.transform = "";
  });
}

<Page onLoadSuccess={removeTextLayerOffset}/>

Also, added these styles (the key here is opacity as 0.1):

.react-pdf__Page__textContent span {
	opacity: 0.1;
}

.react-pdf__Page__textContent span::selection {
	background-color: blue;
}

.react-pdf__Document {
	line-height: initial;
}

Seems to work quite well:

[screenshot]

@valeedmalik

I needed to tweak the solution from @nikonet in order to get it to work.

function removeTextLayerOffset() {
  const textLayers = document.querySelectorAll('.react-pdf__Page__textContent');
  textLayers.forEach((layer) => {
    const { style } = layer;
    style.display = 'none';
  });
}

<Page onLoadSuccess={removeTextLayerOffset}/>

@komret

komret commented Aug 25, 2020

I was also facing this issue, and while @nikonet's tweak did help, I found the cause to be the annotation layer not being styled properly: I had forgotten to import its stylesheet. BTW, the docs mention the path 'react-pdf/dist/esm/Page/AnnotationLayer.css', but it was 'react-pdf/dist/Page/AnnotationLayer.css' in my case.

A similar problem appeared when I zoomed in, causing the PDF to overflow its container. I solved this more or less the same way as @katiewoolston did, by applying display: table to the Page element.

@rahulIcreon

Regarding @nikonet's removeTextLayerOffset workaround above: how can we do this with TypeScript, please?

@doguhanciftci

In reply to @rahulIcreon's question about doing @nikonet's workaround in TypeScript:


function removeTextLayerOffset(): void {
  // Typing the query as HTMLElement gives access to `style` without `any`.
  const textLayers = window.document.querySelectorAll<HTMLElement>(".react-pdf__Page__textContent");
  textLayers.forEach((layer) => {
    const { style } = layer;
    style.top = "0";
    style.left = "0";
    style.transform = "";
  });
}

@nico971

nico971 commented Jan 17, 2023

For me it was solved by simply adding:
<Page renderTextLayer={false} renderAnnotationLayer={false}/>

@ritik-peoplelink

Yes, it's working perfectly.

@aswathykrishna18

Try importing the CSS files:
import 'react-pdf/dist/cjs/Page/AnnotationLayer.css';
import 'react-pdf/dist/cjs/Page/TextLayer.css';

@CoderAWei

Regarding the styles suggested by @juanvalag above: it worked, but the animation disappears.

@GarikHk

GarikHk commented Jul 10, 2023

Hey, I am having a similar issue. On resize, when I make my window smaller, the text layer goes crazy. I was thinking that maybe the font itself doesn't change its size and only the container does. I have tried most of the suggested solutions, but they don't seem to work. Any help would be nice...

.document-content .react-pdf__Page canvas,
.react-pdf__Page__textContent,
.react-pdf__Page__annotations {
  max-width: 100%;
  height: auto !important;
}

I added this so that the PDF could be responsive and the text layer and annotations wouldn't overflow their container.

[screen recording of the resize bug]

@wojtekmaj
Owner

@GarikHk You're not supposed to resize rendered elements using techniques other than passing width, height or scale to React-PDF components. Doing so will result in layer misalignment, because React-PDF can't possibly know that you've resized the canvas using CSS.
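
As a small illustration of sizing through props rather than CSS, the scale can be derived from the container width. This is only a sketch: `fitScale` is a hypothetical helper, and where the page's width at scale 1 comes from (for example, a value recorded when the page loads) is left as an assumption.

```typescript
// Returns the scale at which a page that is `pageWidth` units wide at scale 1
// fits inside `containerWidth`, never enlarging beyond `maxScale`.
function fitScale(containerWidth: number, pageWidth: number, maxScale = 1): number {
  // Fall back to the maximum when either measurement is missing or invalid.
  if (!(containerWidth > 0) || !(pageWidth > 0)) return maxScale;
  return Math.min(maxScale, containerWidth / pageWidth);
}

// Illustrative numbers: a US Letter page is 612 units wide at scale 1.
const narrow = fitScale(306, 612);  // container half as wide as the page
const wide = fitScale(1200, 612);   // container wider than the page
console.log(narrow, wide);
```

The resulting number would be passed as the scale prop, so every layer is rendered at the same size and nothing needs max-width or height overrides.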

@GarikHk

GarikHk commented Jul 22, 2023

Thank you for the clarification.

@hritikb27

The issue seems to persist. After trying @nikonet's solution the offset was reduced, but it is still there, and I need to highlight text on the page, so I cannot hide the annotation or text layers.

I tried the highlighting recipe provided in the docs but have the same issue.

[screenshot]
