Extract_Text extract character of embedded PDF #981
flashpixx
started this conversation in
Ask for help with specific PDFs
Hi @flashpixx, and very interesting. I'm not familiar with LaTeX's |
Hello,
I have a LaTeX-generated PDF that uses the \includepdf call to embed other PDFs. I'm using pdfplumber to extract the text (it is a test case for arbitrary PDFs later). In general extract_text works fine, but on the embedded pages I get "character garbage" back, e.g. (partly extracted)
How can I prevent extract_text from returning this text for a whole page that contains another PDF? Could this be caused by the font information?
Thanks
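
One way to investigate the font question is to look at which fonts each page uses and to drop characters that have no usable text mapping before calling extract_text. The following is only a minimal sketch under assumptions: the helper names (fonts_used, is_unmapped, extract_clean_text) and the bad_fonts parameter are hypothetical, and it assumes the garbage shows up either as "(cid:N)" placeholders or in a font you can identify by name. It may not match what is actually happening in this particular PDF.

```python
from collections import Counter


def fonts_used(chars):
    """Count characters per font name in a list of pdfplumber-style char dicts."""
    return Counter(c.get("fontname", "?") for c in chars)


def is_unmapped(obj):
    """True for a char object whose text is a '(cid:N)' placeholder.

    Assumption: glyphs without a Unicode mapping are reported this way.
    """
    return (
        obj.get("object_type") == "char"
        and obj.get("text", "").startswith("(cid:")
    )


def extract_clean_text(path, bad_fonts=()):
    """Return one text string per page, skipping suspicious characters.

    bad_fonts is a hypothetical list of font names to exclude, chosen by
    inspecting the per-page font counts printed below.
    """
    import pdfplumber  # imported here so the helpers above work without it

    texts = []
    with pdfplumber.open(path) as pdf:
        for page in pdf.pages:
            # Diagnostic: which fonts appear on this page, and how often?
            print(page.page_number, dict(fonts_used(page.chars)))

            # Keep every object except unmapped chars and chars in bad_fonts.
            kept = page.filter(
                lambda obj: not is_unmapped(obj)
                and obj.get("fontname") not in bad_fonts
            )
            texts.append(kept.extract_text() or "")
    return texts
```

If the embedded pages turn out to use a distinct font name, passing it in bad_fonts would exclude those characters. If the fonts are the same as on the other pages, the cause is probably something else.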