Coordinate calculation errors in getDataTm() #558

Open
oliver681 opened this issue Nov 12, 2022 · 0 comments
oliver681 commented Nov 12, 2022

  • PHP Version: 8.1.11
  • PDFParser Version: 2.2.1

First of all, I am new to the PDF specification and have only read the text section (section 9). So in case I have misunderstood something, feel free to edit this issue.

Description:

Some positions are calculated incorrectly in several situations.

The text-showing operators Tj and TJ do not adjust the x position:

  • Wrong x-coordinate when multiple Tj operators are used in the same line: the x-coordinate should be advanced to the end of the last glyph shown.
  • Wrong x-coordinate when multiple TJ operators are used in the same line: same as above. In addition, the TJ operator allows individual glyph positioning within the given strings (see PDF reference 9.4.3). These adjustments are not taken into account.

These two issues lead to the following situation: If there are multiple Tj/TJ operators in the same line, getDataTm() will return the same x-coordinate for all strings.

Once this is fixed, all operators that move the origin of the text space to the next line (such as T*, ' and ") must reset the x-coordinate. I think that has to be done with Tlm (the text line matrix).
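
A minimal sketch of the intended behaviour, written in Python for brevity rather than in the library's PHP. It is not PDFParser's code: `TextCursor`, `show` and `next_line` are hypothetical names, and the `advance` passed to `show` is assumed to be the already computed width of the shown string in text space units. Showing text moves Tm along x, while a line-moving operator updates Tlm and copies it back into Tm, which is what resets x.

```python
from dataclasses import dataclass, field


def mat_mul(a, b):
    """Multiply two PDF matrices given as [a, b, c, d, e, f] (row-vector convention)."""
    return [
        a[0] * b[0] + a[1] * b[2],
        a[0] * b[1] + a[1] * b[3],
        a[2] * b[0] + a[3] * b[2],
        a[2] * b[1] + a[3] * b[3],
        a[4] * b[0] + a[5] * b[2] + b[4],
        a[4] * b[1] + a[5] * b[3] + b[5],
    ]


@dataclass
class TextCursor:
    """Hypothetical tracker for the text matrix (Tm) and text line matrix (Tlm)."""
    tm: list = field(default_factory=lambda: [1, 0, 0, 1, 0, 0])
    tlm: list = field(default_factory=lambda: [1, 0, 0, 1, 0, 0])
    leading: float = 0.0  # Tl, as set by the TL operator

    def set_tm(self, a, b, c, d, e, f):
        # The Tm operator sets both Tm and Tlm.
        self.tm = [a, b, c, d, e, f]
        self.tlm = list(self.tm)

    def show(self, advance):
        # Tj/TJ: report the origin of this string, then move Tm by its width.
        origin = (self.tm[4], self.tm[5])
        self.tm = mat_mul([1, 0, 0, 1, advance, 0], self.tm)
        return origin

    def next_line(self):
        # T*: move Tlm down by the leading and restart Tm from it (x is reset).
        self.tlm = mat_mul([1, 0, 0, 1, 0, -self.leading], self.tlm)
        self.tm = list(self.tlm)


cursor = TextCursor(leading=12)
cursor.set_tm(1, 0, 0, 1, 100, 700)
first = cursor.show(50)   # first Tj on the line
second = cursor.show(30)  # second Tj starts where the first one ended
cursor.next_line()        # T*
third = cursor.show(10)   # back at the line start, one leading lower
```

With these example values the three strings start at different positions instead of all sharing the same x-coordinate.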

  • Text-positioning operators Td and TD don't reflect scaling of text:
    The text can be scaled either with the Tf operator (through the font size Tfs) or with the Tm operator. When the text-positioning operators Td or TD are used, the coordinates are updated without taking into account the current scaling set by Tf or Tm, which results in an incorrect translation of the current position. Correcting this will also solve Error in Y coordinate #532.
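
As a rough illustration of the matrix part of this point, the sketch below applies a Td offset through the text line matrix, following the specification's definition Tlm = [1 0 0 1 tx ty] × Tlm. It is written in Python, `apply_td` is a hypothetical helper and not part of PDFParser, and it only covers scaling carried by Tm. The specification expresses the Td operands in unscaled text space units, so how the font size should enter the calculation is left open here.

```python
def apply_td(tlm, tx, ty):
    """Return the text line matrix after `tx ty Td`.

    `tlm` is a PDF matrix [a, b, c, d, e, f]. The offset is transformed by
    the matrix, so any scaling set through Tm also scales the translation.
    """
    a, b, c, d, e, f = tlm
    return [a, b, c, d, tx * a + ty * c + e, tx * b + ty * d + f]


# Text matrix that scales by 2 and starts at (100, 700).
scaled = apply_td([2, 0, 0, 2, 100, 700], 10, -5)

# The same offset with an unscaled matrix, for comparison.
unscaled = apply_td([1, 0, 0, 1, 100, 700], 10, -5)
```

Adding tx and ty directly to the stored coordinates would give (110, 695) in both cases; going through the matrix gives (120, 690) for the scaled one.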

Consideration of other text state parameters: The following text state parameters are also not considered when calculating the x,y-coordinates:

  • For the x-coordinate: Tc (character spacing), Tw (word spacing), Th (horizontal scaling)
  • For the y-coordinate: Trise (text rise)

The PDF specification contains a formula for updating the x,y-coordinates during horizontal or vertical writing in section 9.4.4 (Text space details).
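
For horizontal writing, that formula is tx = ((w0 - Tj/1000) × Tfs + Tc + Tw) × Th. A small Python sketch of it follows. The function name and the `is_space` flag are assumptions made for illustration, `w0` is assumed to be the glyph width already divided by 1000, and `th` is assumed to be a factor (1.0 for 100%).

```python
def horizontal_displacement(w0, tfs, tc=0.0, tw=0.0, th=1.0, tj=0.0, is_space=False):
    """Displacement tx of one glyph in horizontal writing mode.

    w0        glyph width in text space units (font width / 1000)
    tfs       font size set by Tf
    tc        character spacing (Tc)
    tw        word spacing (Tw); only applied to the space character
    th        horizontal scaling as a factor (Tz 100 corresponds to 1.0)
    tj        position adjustment from a TJ array, in thousandths of a unit
    is_space  True when the glyph is the single-byte character code 32
    """
    word_spacing = tw if is_space else 0.0
    return ((w0 - tj / 1000.0) * tfs + tc + word_spacing) * th


plain = horizontal_displacement(w0=0.5, tfs=12)
kerned = horizontal_displacement(w0=0.5, tfs=12, tj=-250)
space = horizontal_displacement(w0=0.25, tfs=12, tc=1, tw=2, th=0.5, is_space=True)
```

Summing this value over the glyphs of a string gives the `advance` by which the x-coordinate has to move after a Tj or TJ operator.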

All text operators are described in section 9 of the PDF specification: https://opensource.adobe.com/dc-acrobat-sdk-docs/standards/pdfstandards/pdf/PDF32000_2008.pdf
