This repository has been archived by the owner on Apr 15, 2024. It is now read-only.

Bounding Box Y coords reflect distance from bottom of document #19

Closed
danshultz opened this issue Sep 27, 2012 · 2 comments
Comments

@danshultz

The y coordinates in the bbox attributes of the textline and text elements are relative to the bottom of the document instead of the top.

pdfminer==20110515

<?xml version="1.0" encoding="utf-8" ?>
<pages>
<page id="1" bbox="0.000,0.000,612.000,720.000" rotate="0">
<figure name="Im0" bbox="0.000,0.000,612.000,720.000">
<image width="612" height="720" />
</figure>
</page>
<page id="2" bbox="0.000,0.000,612.000,720.000" rotate="0">
<textbox id="0" bbox="65.694,479.087,295.324,490.987">
<textline bbox="65.694,479.087,295.324,490.987">
<text font="UANQRD+FrutigerLTStd-BoldCn" bbox="65.694,479.087,70.324,490.987" size="11.900">F</text>
<text font="UANQRD+FrutigerLTStd-BoldCn" bbox="70.224,479.087,73.004,490.987" size="11.900">i</text>
<text font="UANQRD+FrutigerLTStd-BoldCn" bbox="72.904,479.087,78.464,490.987" size="11.900">r</text>
<text font="UANQRD+FrutigerLTStd-BoldCn" bbox="78.364,479.087,86.514,490.987" size="11.900">m</text>

Measured from the top of the page, the "F" character actually starts at 720 - 479 (about 241), not 479.

@titusz

titusz commented Sep 27, 2012

The PDF coordinate system has its origin in the lower-left corner, so that might be the reason.
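
If top-down coordinates are needed, the values can be flipped against the page height. The following is only a minimal sketch: `parse_bbox` and `flip_bbox` are hypothetical helper names, not part of pdfminer, and it assumes an unrotated page whose height can be read from the page bbox in the XML output above.

```python
def parse_bbox(attr):
    """Parse a bbox attribute string such as '65.694,479.087,70.324,490.987'."""
    return tuple(float(value) for value in attr.split(","))


def flip_bbox(bbox, page_height):
    """Convert (x0, y0, x1, y1) from a bottom-left origin to a top-left origin.

    Hypothetical helper: assumes the page is not rotated.
    """
    x0, y0, x1, y1 = bbox
    # The old top edge (y1) becomes the smaller y value after flipping,
    # and the old bottom edge (y0) becomes the larger one.
    return (x0, page_height - y1, x1, page_height - y0)


# Values taken from the XML in this issue.
page = parse_bbox("0.000,0.000,612.000,720.000")
page_height = page[3] - page[1]

f_bbox = parse_bbox("65.694,479.087,70.324,490.987")
flipped = flip_bbox(f_bbox, page_height)
print(flipped)
```

With the values from this issue, the bottom edge of the "F" ends up roughly 241 points below the top of the page, which matches the 720 - 479 estimate.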

@danshultz
Author

Thanks..that would make sense then :)

Guess I didn't read the pdf docs well enough

side2k pushed a commit to side2k/pdfminer that referenced this issue Jan 20, 2017
euske#34)

* Removing all the "#!/usr/bin/env python" lines, as they are not needed for python3, solving issue number: euske#19.

* Restored all the shebangs in the tools and tests folders (because they are real executables) but used "#!/usr/bin/env python" instead of "#!/usr/bin/python" as this blog points out: https://www.peterbe.com/plog/importance-of-env
Removed also the shebang from pdfminer/psparser.py file.