Extract data, objects, elements from the PDF #8

Closed
pknabe opened this issue Oct 19, 2023 · 2 comments

Comments

@pknabe

pknabe commented Oct 19, 2023

What do I want to do?
I would like to extract content from a PDF in PHP using the Smalot/PDFparser library (already installed and running).
What do I want to extract?
I would like to extract the PDF's content from one specific level/layer.
Secondly, I would like to hide individual layers.
I would like to save the content of the extracted layer as a new PDF and/or output it as XML.
I have attached an example PDF.
It would be nice if you could show how to solve this in a small code example.

My development and system environment:
I develop on a Windows 10 machine under Laragon (localhost) with PHP.
example_pdf.pdf

Thanks for your help in advance.

@fahadadeel

@pknabe

You want to extract content from a specific level or layer of a PDF and then manipulate it (hide individual layers, or save the content as a new PDF or XML). The Smalot\PdfParser library in PHP handles extraction of text, images, and other basic elements from PDFs well, but it may not natively support working with specific layers of a PDF document.

As far as I know, the library focuses on extracting basic elements and might not provide functions for layer manipulation. Such tasks usually require reading and altering the PDF's internal structure, which is complex and typically outside the scope of a basic parsing library.
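
To make that limit concrete, here is a minimal sketch of what the library can do. It assumes smalot/pdfparser is installed through Composer and that the attached example_pdf.pdf sits next to the script. PDF layers are stored as "optional content groups", dictionaries with /Type /OCG, so the second loop only lists them. Whether `get('Name')` returns a readable layer name for your file is an assumption I have not verified, and nothing here filters text by layer, hides a layer, or writes a new PDF.

```php
<?php
// Sketch only: assumes smalot/pdfparser was installed with Composer
// and that example_pdf.pdf is in the same directory as this script.
require __DIR__ . '/vendor/autoload.php';

use Smalot\PdfParser\Parser;

$parser = new Parser();
$pdf = $parser->parseFile(__DIR__ . '/example_pdf.pdf');

// Plain text, page by page. The parser does not separate text by layer,
// so content from all layers ends up mixed together here.
$pageNumber = 1;
foreach ($pdf->getPages() as $page) {
    echo "Page {$pageNumber}:\n" . $page->getText() . "\n";
    $pageNumber++;
}

// List the layer definitions (optional content groups, /Type /OCG).
// Assumption: the layer name is stored under the /Name key and the
// parser exposes it as a readable string.
foreach ($pdf->getObjectsByType('OCG') as $id => $layer) {
    echo $id . ': ' . $layer->get('Name')->getContent() . "\n";
}
```

If the second loop prints nothing, either the file has no layers or the parser could not read those objects. Mapping text to a layer would mean reading the marked-content operators in each page's content stream, which this sketch does not attempt.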

However, if you have found a solution or a workaround that works with PHP and Smalot\PdfParser, please share it with the community. Thanks.

@pknabe
Author

pknabe commented Jan 4, 2024 via email
