Extract data, objects, elements from the PDF #8
As I understand your requirement, you want to extract content from a specific layer of a PDF and then manipulate it, for example by hiding individual layers or saving the content as a new PDF or as XML. The Smalot\PdfParser library for PHP is good at extracting text, images, and other basic elements from PDFs, but it may not natively support working with specific layers of a PDF document.

As far as I know, the library focuses on extracting basic elements and might not provide functionality for detailed layer manipulation. Such tasks often require understanding and altering the PDF's internal structure, which can be complex and is typically outside the scope of a basic parsing library.

However, if you have found a solution or workaround that stays within PHP and Smalot\PdfParser, it would be great if you shared it with the community. Thanks.
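For readers who only need to find out which layers a file contains, here is a minimal sketch in plain PHP. It relies on the fact that PDF layers are stored as optional content groups, which are dictionaries with `/Type /OCG` and a `/Name` entry. The helper `listLayerNames` is hypothetical and not part of Smalot\PdfParser. It assumes the OCG objects are stored uncompressed (not inside object streams) and that their names are plain literal strings, so it will miss layers in many real-world files. It does not hide layers or write a new PDF.

```php
<?php
declare(strict_types=1);

/**
 * List the names of optional content groups (layers) found in raw PDF source.
 * Hypothetical helper for illustration; not part of smalot/pdfparser.
 *
 * @return array<int, string> object number => layer name
 */
function listLayerNames(string $pdfSource): array
{
    $layers = [];

    // Split the source into indirect objects: "<num> <gen> obj ... endobj".
    preg_match_all('~(\d+)\s+\d+\s+obj\b(.*?)\bendobj~s', $pdfSource, $objects, PREG_SET_ORDER);

    foreach ($objects as [, $number, $body]) {
        // Keep only dictionaries typed as /OCG (not /OCMD or other types).
        if (!preg_match('~/Type\s*/OCG(?![A-Za-z0-9])~', $body)) {
            continue;
        }
        // Read the literal string after /Name; hex-encoded names are skipped.
        if (preg_match('~/Name\s*\(((?:\\\\.|[^\\\\)])*)\)~s', $body, $name)) {
            // Undo backslash escapes such as \( and \) inside the string.
            $layers[(int) $number] = stripcslashes($name[1]);
        }
    }

    return $layers;
}
```

It could be called as `listLayerNames(file_get_contents('example_pdf.pdf'))`. With Smalot\PdfParser already installed, `$pdf->getObjectsByType('OCG')` on a parsed document may reach the same objects, but that has not been checked against the attached file.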
Hello Fahad,
Even though I didn't really expect any feedback after such a long time,
I would of course like to thank you very much for your answer. In any
case, it is now clear and we have to rely on a different solution in
this project.
Best regards and thank you again for your effort.
Paulo
------ Original message ------
From: "Fahad Adeel" ***@***.***>
To: "fileformat-free-consulting/projects" ***@***.***>
Cc: "pknabe" ***@***.***>; "Mention" ***@***.***>
Sent: 03.01.2024 10:28:35
Subject: Re: [fileformat-free-consulting/projects] Extract data, objects, elements from the PDF (Issue #8)
What do I want to do?
I would like to extract content from a PDF in PHP using the smalot/pdfparser library (already installed and running).
What do I want to extract?
First, I would like to extract the PDF's content from one specific level/layer.
Second, I would like to be able to hide these levels individually.
Finally, I would like to save the content of the extracted level as a new PDF and/or output it as XML.
I have attached an example PDF.
It would be nice if you could show a solution to the problem described above in a small code example.
My development and system environment:
I develop on a Windows 10 machine under Laragon (localhost) with PHP.
example_pdf.pdf
Thanks for your help in advance.