Fails when uploading file that contains comments within PDF #88

Closed
lauzonryd opened this issue Aug 19, 2021 · 8 comments

Comments

@lauzonryd

When uploading and processing a PDF that contains comments, pdfreader is unable to handle the request, and my backend Node service fails. I can call PdfReader().parseBuffer(file, function(err, item) on the buffered file, and it reads the file and the first item, but it fails after that.

Is this a known bug? If so, is there any way I can handle it, or a way to detect that the file has comments and return an error? I've tried some workarounds, but the service fails every time.
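One way to turn a parse failure into an error that can be returned to the caller is to wrap the callback API in a Promise. The sketch below is only an illustration: `collectItems` is a hypothetical helper, not part of pdfreader, and it assumes the failure is actually reported through the `err` argument of the callback. It would not help if the parser hangs or takes the whole process down.

```javascript
// Hypothetical helper (not part of pdfreader). `parseBuffer` is any function
// with the shape (buffer, callback), where the callback receives (err, item)
// and a falsy item marks the end of the file.
function collectItems(parseBuffer, buffer) {
  return new Promise((resolve, reject) => {
    const items = [];
    // A synchronous throw inside this executor also rejects the Promise.
    parseBuffer(buffer, (err, item) => {
      if (err) {
        reject(err instanceof Error ? err : new Error(String(err)));
      } else if (!item) {
        resolve(items); // end of file
      } else {
        items.push(item);
      }
    });
  });
}

// Assumed usage with pdfreader (requires the package to be installed):
//   const { PdfReader } = require("pdfreader");
//   collectItems((buf, cb) => new PdfReader().parseBuffer(buf, cb), file)
//     .then((items) => { /* use items */ })
//     .catch((err) => { /* return an error response */ });
```

Passing the parse function in as a parameter keeps the helper independent of the pdfreader version in use.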

@adrienjoly
Owner

Thanks for reporting this issue, Ryan.

In order to find a solution, our community needs a more precise description of the problem (e.g. error message) and, if possible, a way to reproduce the issue. Can you share more details and provide a PDF file that fails to parse with pdfreader, please?

@lauzonryd
Author

I was able to diagnose the exact situation in which this fails, and I am attaching two PDFs here. It appears that when a comment is added onto a table, the server crashes. There is no error message; it just stops working completely, and the whole server needs to be restarted. If there were an error message, I would be able to handle it accordingly.

testingWithTable.pdf - this PDF fails
testingWithTable-noComment.pdf - this PDF does not fail
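Since the failure produces no error and takes the whole server down, one possible mitigation is to run the parsing step in a separate process with a timeout, so that a crash or a hang becomes an ordinary rejected Promise. The following is a minimal sketch under stated assumptions: `runIsolated` and its parameters are hypothetical names, and the worker script (which in practice would read the PDF from stdin, call pdfreader, and print the result) is supplied by the caller.

```javascript
const { spawn } = require("child_process");

// Hypothetical helper: runs `workerSource` in a separate Node process, writes
// `input` to its stdin, and resolves with whatever it prints to stdout.
// A crash (non-zero exit) or a hang (timeout, then SIGKILL) rejects the
// Promise instead of stopping the server process.
function runIsolated(workerSource, input, timeoutMs = 10000) {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", workerSource], {
      stdio: ["pipe", "pipe", "inherit"],
    });
    let out = "";
    // Kill the child if it has not finished within the time limit.
    const timer = setTimeout(() => child.kill("SIGKILL"), timeoutMs);

    child.stdout.setEncoding("utf8");
    child.stdout.on("data", (chunk) => { out += chunk; });
    child.stdin.on("error", () => {}); // ignore EPIPE if the child dies early
    child.on("error", (err) => { clearTimeout(timer); reject(err); });
    child.on("close", (code, signal) => {
      clearTimeout(timer);
      if (code === 0) resolve(out);
      else reject(new Error(`parser process failed (code=${code}, signal=${signal})`));
    });

    child.stdin.end(input);
  });
}
```

The trade-off is the cost of starting a process per upload, which may be acceptable when the alternative is restarting the server by hand.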


@adrienjoly
Owner

adrienjoly commented Aug 20, 2021 via email

@lauzonryd
Author

When I try to open these PDFs with pdf2json directly, I get the same results: the server crashes when a comment is placed within a table, and it works as expected without comments.

@adrienjoly
Owner

adrienjoly commented Aug 22, 2021 via email

@adrienjoly
Owner

FYI, I just published a new version of pdfreader that uses the latest version of pdf2json: Release v1.2.11 · adrienjoly/npm-pdfreader

@adrienjoly adrienjoly added the bug label Aug 27, 2021
@lauzonryd
Author

Thanks, I ran the update and tried again, but the issue persists. I posted an issue in the pdf2json repo, but I haven't gotten any response.

@adrienjoly
Owner

For reference, here's the issue you opened on pdf2json's repo: modesty/pdf2json#242

I'm guessing that it was fixed in pdf2json v1.2.5. Unfortunately, that release introduced a breaking change which is not (yet) supported by pdfreader. (see #95)

If you feel like it, feel free to propose a Pull Request to make pdfreader support that version of pdf2json.
