coordinates behaves differently for text and graphics in rotated page #463

Closed
twnaing opened this issue Mar 11, 2020 · 10 comments

@twnaing

twnaing commented Mar 11, 2020

Please provide all mandatory information!

Describe the bug (mandatory)

Coordinates behave differently for text and graphics (circle, rectangle) on a rotated page.

To Reproduce (mandatory)

  • get a pdf with rotated page (e.g. rotate PDF online)
  • in python program, create a Point
  • use the same x and y for insertText
  • use the same x and y for drawCircle
  • use the same x and y (as x0, y0) for drawRect
  • observe circle, text and rectangle all over the place

Expected behavior (optional)

  • drawn circle and text should appear at the same location
  • Top Left of drawn rectangle should be aligned with center of drawn circle

Screenshots (optional)

before manipulation
[Screenshot: pymupdf-before-Screenshot_2020-03-11_17-21-43]

after manipulation
[Screenshot: pymupdf-Screenshot_2020-03-11_16-46-07]

Please ignore the extra rectangle at the bottom and the text in the center.

Your configuration (mandatory)

  • Operating system, potentially version and bitness: Manjaro linux 64bit
  • Python version, bitness: Python 3.7.6 (default, Mar 10 2020, 13:49:53) [GCC 9.2.1 20200130]
  • PyMuPDF version, installation method (wheel or generated from source): 1.16.10 & 1.16.11 via pip/pipenv

Additional context (optional)

The PDF has been checked for this issue
https://pymupdf.readthedocs.io/en/latest/faq/#misplaced-item-insertions-on-pdf-pages

code used

    import fitz  # PyMuPDF

    doc = fitz.open("rotated.pdf")  # hypothetical file name; any PDF with a rotated page
    page = doc[0]
    rotation = page.rotation  # or set manually to 90
    green = fitz.utils.getColor("green")
    page.drawCircle(fitz.Point(20, 20), 2, color=green)
    page.insertText(fitz.Point(20, 20), "20, 20", rotate=rotation, fontsize=9)
    page.drawRect(fitz.Rect(20, 20, 120, 120), color=green, fill=green)

file used

@JorjMcKie
Collaborator

Oha!
Bug confirmed. Thanks for catching this.

JorjMcKie added a commit that referenced this issue Mar 12, 2020
@JorjMcKie
Collaborator

As you can see, I have located the error and published a fix with version 1.16.12.
The new version is already in the releases folder of this repo.
Still in the process of also uploading it to PyPI.
Once the latter is done I will close this issue.

Background of the cause:
This seems to be a MuPDF bug: there is a MuPDF function which translates between MuPDF's coordinate system and that of PDF itself. If a PDF page has a rotation != 0, the respective MuPDF translation matrix is erroneous. Text insertion happened to be the only place where I did not use this matrix but did my own calculation instead, which is why those cases were correct 😎.
For the fix I have changed the calculation globally in PyMuPDF and confirmed that the methods draw*(...), insertImage(...) and showPDFpage(...) now all do their insertions consistently on the unrotated page.
As a consequence, the image rectangle returned by Page.getImageBbox(...) has rotated coordinates for a rotated page!
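
To illustrate the kind of translation discussed above, here is a minimal sketch for the simplest case only: an unrotated page whose lower-left corner sits at the PDF origin (0, 0). MuPDF measures y downwards from the top-left corner, while PDF measures y upwards from the bottom-left corner. The function names are hypothetical and this is not MuPDF's actual matrix code.

```python
def mupdf_to_pdf(x, y, page_height):
    """Map a top-left-origin point (y grows downwards) to a
    bottom-left-origin point (y grows upwards) on an unrotated page."""
    return x, page_height - y


def pdf_to_mupdf(x, y, page_height):
    """Inverse mapping; the vertical flip is its own inverse."""
    return x, page_height - y


# Assumed page height of 842 points (A4 portrait).
print(mupdf_to_pdf(20, 20, 842))  # (20, 822)
```

For a rotated page the mapping must additionally account for the rotation, which is the part that went wrong here.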

@JorjMcKie
Collaborator

PyPI upload now complete.
Going to close this issue. Please drop a note if related issues pop up.

@twnaing
Author

twnaing commented Mar 13, 2020

As a consequence, the image rectangle returned by Page.getImageBbox(...) has rotated coordinates for a rotated page!

@JorjMcKie, this means "coordinates will be from the top left of the un-rotated page", right? With the 1.16.12 release and the previous code, I get the following.

[Screenshot: pymupdf-1 16 12-Screenshot_2020-03-13_11-55-31]

Note the text, circle and rectangle at 20, 20 from the top left of the un-rotated page (top right of the 90 deg rotated page).

@JorjMcKie
Collaborator

JorjMcKie commented Mar 13, 2020

@twnaing - exactly right.

My approach of repairing that bug has its limitations of course. And I'm afraid they are inevitable limitations, too.

Luckily, you didn't also complain about inconsistencies when inserting annotations on a rotated page 😎! They too behave inconsistently, in the same way as your text and image examples ...
And I (currently) see no way to address this, because annotation handling is done by MuPDF itself, behind the curtain from PyMuPDF code's point of view (drawing and image insertion are code which I have developed myself, so I have better control over them). And of course I do not want to rewrite their code ...

There is a rather dirty circumvention for all of this:

  1. set page rotation to 0 (page.setRotation(0)).
  2. do your stuff like normal (annotation insertion, info extraction, ...)
  3. when done, set rotation again to the original value.
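
The three steps above can be sketched as a small context manager. This is only an illustration, not part of PyMuPDF: the name `unrotated` is hypothetical, and the code assumes nothing more than the `page.rotation` property and the `page.setRotation()` method mentioned in this thread.

```python
from contextlib import contextmanager


@contextmanager
def unrotated(page):
    """Temporarily set the page rotation to 0 and restore it afterwards."""
    original = page.rotation      # step 3 needs the original value
    page.setRotation(0)           # step 1
    try:
        yield page                # step 2 happens inside the with-block
    finally:
        page.setRotation(original)  # step 3, also runs if step 2 raises


# Possible usage with a PyMuPDF page object:
#
#     with unrotated(page):
#         page.drawRect(fitz.Rect(20, 20, 120, 120))
```

The `try`/`finally` ensures the original rotation is restored even when the work inside the block fails.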

@twnaing
Author

twnaing commented Mar 15, 2020

So, the summary is

  • Page.rect and Page.bound() return x and y using the top left of the rotated page
  • Page.insertX uses x and y from top left of un-rotated page
  • Page.addXAnnot uses x and y from top left of rotated page

@JorjMcKie
Collaborator

JorjMcKie commented Mar 15, 2020

Yes, exactly. Let me add:

  • Page.rect and Page.bound() return x and y using the top left of the rotated page
  • page.CropBox is always the unrotated page rectangle
  • Page.insertX, Page.draw*, Page.showPDFpage use x and y from the top left of the un-rotated page
  • Page.addXAnnot uses x and y from top left of rotated page
  • Annot.rect returns the rotated rectangle if the annotation had been inserted on the non-rotated page. In v1.16.13 there will be a function to de-rotate a rectangle ...
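
To make the two coordinate systems in this summary concrete, here is a pure-Python sketch that converts a point between them. It assumes the page rotation is clockwise and a multiple of 90 degrees, that both systems have their origin at the top left with y growing downwards, and that `width` and `height` are those of the un-rotated page. The function names are hypothetical; this is not the de-rotation function announced for PyMuPDF.

```python
def rotate_point(x, y, width, height, rotation):
    """Map a point from un-rotated to rotated page coordinates."""
    rotation %= 360
    if rotation == 0:
        return x, y
    if rotation == 90:
        return height - y, x
    if rotation == 180:
        return width - x, height - y
    if rotation == 270:
        return y, width - x
    raise ValueError("rotation must be a multiple of 90")


def derotate_point(x, y, width, height, rotation):
    """Inverse of rotate_point: rotated back to un-rotated coordinates."""
    rotation %= 360
    if rotation == 0:
        return x, y
    if rotation == 90:
        return y, height - x
    if rotation == 180:
        return width - x, height - y
    if rotation == 270:
        return width - y, x
    raise ValueError("rotation must be a multiple of 90")


# Assumed A4 portrait page (595 x 842 points) rotated by 90 degrees:
# 20, 20 from the un-rotated top left ends up near the rotated top right.
print(rotate_point(20, 20, 595, 842, 90))  # (822, 20)
```

A rectangle can be converted by mapping two opposite corners and re-normalizing, so that x0 <= x1 and y0 <= y1.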

@JorjMcKie JorjMcKie reopened this Mar 15, 2020
@JorjMcKie
Collaborator

The remaining annoyance is about annotations. I will continue to investigate this ...

@twnaing
Author

twnaing commented Mar 16, 2020

When inserting images near the page edges, I found that page.insertImage raises ValueError: "rect must be finite and not empty".

It traced back to utils.py:252

    r = page.rect & rect
    if r.isEmpty or r.isInfinite:
        raise ValueError("rect must be finite and not empty")

IMO, on line 250, it should be page.CropBox (instead of page.rect), since, as you mentioned, the method uses x and y from the top left of the un-rotated page, whereas page.rect describes the rotated page.
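
A small pure-Python sketch of why the check fails near the edges. It assumes an A4 portrait page rotated by 90 degrees and uses plain tuples with simplified stand-ins for the intersection operator and the emptiness test; the helper names are hypothetical, not PyMuPDF code.

```python
def intersect(a, b):
    """Intersect two rectangles given as (x0, y0, x1, y1) tuples."""
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def is_empty(r):
    """A rectangle without positive width and height counts as empty."""
    return r[0] >= r[2] or r[1] >= r[3]


unrotated_page = (0, 0, 595, 842)  # plays the role of page.CropBox
rotated_page = (0, 0, 842, 595)    # plays the role of page.rect at rotation 90
target = (450, 700, 550, 800)      # near the bottom edge, un-rotated coordinates

print(is_empty(intersect(rotated_page, target)))    # True
print(is_empty(intersect(unrotated_page, target)))  # False
```

The target lies fully on the un-rotated page, but its y range (700 to 800) is outside the rotated page's height of 595, so intersecting with the rotated rectangle yields an empty result.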

Currently I use the dirty circumvention:

  • set page rotation to 0 (page.setRotation(0)).
  • do your stuff like normal (annotation insertion, info extraction, ...)
  • when done, set rotation again to the original value.

@JorjMcKie
Collaborator

You are right - thanks for the hint. I will change that one too. Will become effective in a version after 1.16.13.

@JorjMcKie JorjMcKie reopened this Mar 16, 2020