Ignore text content that is positioned outside visible area of page #109

Closed
zagraves opened this Issue Mar 27, 2013 · 3 comments

@zagraves

Thanks for implementing the option to use the CropBox.

Another thing I've noticed in the PDFs I've been working with is that text outside the bounds of the CropBox is still marked up, just displayed off the page.

For example, I have a PDF with printer crop marks, made up of text and images, that are positioned outside the CropBox. This is shown in Acrobat here:
https://f.cloud.github.com/assets/17771/310993/58da058e-972c-11e2-95e4-52960b3f7fd3.png

It would be nice to strip any text that would be positioned outside the visible area, whether using --use-cropbox or not.
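
To illustrate the kind of check being asked for, here is a minimal sketch in Python. It is not pdf2htmlEX or poppler code: the `Rect` type and the `is_outside` and `visible_text` helpers are hypothetical names made up for this example, and it assumes each text item already comes with a bounding box in the same coordinate space as the visible area (the CropBox or the MediaBox).

```python
from typing import List, NamedTuple, Tuple


class Rect(NamedTuple):
    """Axis-aligned rectangle (hypothetical helper type), with x0 <= x1 and y0 <= y1."""
    x0: float
    y0: float
    x1: float
    y1: float


def is_outside(text_box: Rect, visible: Rect) -> bool:
    """Return True when text_box does not overlap the visible area at all."""
    return (
        text_box.x1 <= visible.x0      # entirely to the left
        or text_box.x0 >= visible.x1   # entirely to the right
        or text_box.y1 <= visible.y0   # entirely below
        or text_box.y0 >= visible.y1   # entirely above
    )


def visible_text(items: List[Tuple[str, Rect]], visible: Rect) -> List[Tuple[str, Rect]]:
    """Keep only the text items that overlap the visible area, at least partially."""
    return [(text, box) for text, box in items if not is_outside(box, visible)]


# Example data (made up): a US Letter page and two text items.
crop_box = Rect(0, 0, 612, 792)
items = [
    ("Body text", Rect(72, 700, 200, 712)),
    ("crop mark label", Rect(-30, -30, -10, -20)),
]
kept = visible_text(items, crop_box)
```

In this sketch, text that straddles the edge of the visible area is kept; only text that lies fully outside it is dropped.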

Can provide a PDF sample if needed.

Cheers
Zach

@coolwanglu
Owner

Thanks for reporting.

Yes, please provide a sample PDF file.

I'm not sure it'll be easy to "physically" remove them; that parameter actually just toggles a switch in a poppler API.

@coolwanglu
Owner

I see. I didn't find an easy way, and this issue can be viewed as a simple case of #39, so I've marked it as a duplicate.

@coolwanglu coolwanglu closed this Mar 28, 2013