validation fails #618

Closed
arnoudo opened this issue May 5, 2023 · 5 comments
@arnoudo

arnoudo commented May 5, 2023

Thank you for developing a shell program with form processing functionality!

The attached file testform.pdf was exported from testform.odt in LibreOffice Writer 7.5.3.2 (latest). The command "pdfcpu form list testform.pdf" returns the following error.

pdfcpu: validateNameEntry: dict=pagesDict entry=Tabs invalid type types.StringLiteral

As a sanity check, I made another file testform2.pdf by adding form fields with Adobe Acrobat Pro 9 (latest version I have). The command "pdfcpu form list testform2.pdf" succeeds. So apparently, the issue is with the PDF file exported from LibreOffice.

OS: Windows 10 Enterprise 22H2 64-bit (19045.2846)
pdfcpu: v0.4.0 dev
build : 2023-03-01T00:15:26Z
commit: 7ff654b

I hope this report is helpful in the development of your code. Let me know if you need any other information or tests. This is the first time I have submitted a bug report (on GitHub or anywhere), so apologies if it does not follow the intended format.

testform.odt
testform.pdf
testform2.pdf

@hhrutter
Collaborator

hhrutter commented May 6, 2023

Your file does not validate because the page attribute Tabs is expected to be a name object, but the file stores it as a string literal.
We will have to extend relaxed validation to accept this.

Always ensure your input files pass validation before processing them:
pdfcpu validate input.pdf
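
To illustrate what extending relaxed validation could mean here, the following is a minimal Python sketch, not pdfcpu's actual Go implementation. The `Name` and `StringLiteral` classes and the `validate_tabs` function are hypothetical stand-ins for parsed PDF objects, and the allowed values R, C and S are the ones defined for the Tabs entry in ISO 32000-1. Strict mode rejects a string literal; relaxed mode converts it to a name first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Name:
    """Hypothetical stand-in for a parsed PDF name object, e.g. /S."""
    value: str


@dataclass(frozen=True)
class StringLiteral:
    """Hypothetical stand-in for a parsed PDF string literal, e.g. (S)."""
    value: str


# Tab order values from ISO 32000-1: row, column and structure order.
ALLOWED_TABS = {"R", "C", "S"}


def validate_tabs(obj, relaxed=False):
    """Return the Tabs entry as a Name, or raise ValueError."""
    if isinstance(obj, StringLiteral):
        if not relaxed:
            raise ValueError("Tabs must be a name object, got StringLiteral")
        # Relaxed mode: tolerate the wrong object type and normalize it.
        obj = Name(obj.value)
    if not isinstance(obj, Name):
        raise ValueError(f"Tabs must be a name object, got {type(obj).__name__}")
    if obj.value not in ALLOWED_TABS:
        raise ValueError(f"Tabs has unsupported value {obj.value!r}")
    return obj
```

With this sketch, `validate_tabs(StringLiteral("S"))` raises in strict mode, while `validate_tabs(StringLiteral("S"), relaxed=True)` returns `Name("S")`.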

hhrutter changed the title from "pdfcpu form list fails with basic form created in LibreOffice Writer" to "validation fails" on May 6, 2023

@petervwyatt

I found the error and reported it to LibreOffice: https://bugs.documentfoundation.org/show_bug.cgi?id=155228

@asciim0

asciim0 commented May 10, 2023

@petervwyatt - quick question, if i may. do you consider this pdf error critical?

@petervwyatt

The page object /Tabs entry defines the navigation order of annotations on a page. LibreOffice sets it to logical structure order (/S), I assume because they are writing out Tagged PDF and logical structure. The comment a few lines above the line in question also mentions PDF/UA-1 (ISO 14289-1), which requires /Tabs to be /S, so without this correction LibreOffice PDFs will not be compliant with PDF/UA-1!

If we ignore compliance with PDF/UA-1, then the error is not critical for a page that has no annotations. But if the page has more than one annotation, and especially if the PDF is read by end users with disabilities who rely on assistive technology, then the /Tabs entry is very important for meaningful navigation of annotations. That is why it is a mandated requirement in PDF/UA-1!
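
For anyone who wants to check their own files for this, here is a rough Python sketch that scans the raw bytes of a PDF for /Tabs entries and reports whether each value is written as a name (/S) or as a string literal ((S)). The function name `find_tabs_entries` is hypothetical, and the approach is only a heuristic: it assumes the page dictionaries are stored uncompressed (not inside object streams) and it does not handle nested parentheses or hex strings.

```python
import re

# "/Tabs" followed by either a string literal "(...)" or a name "/X".
# Heuristic only: no nested parentheses, no hex strings, no object streams.
TABS_ENTRY = re.compile(
    rb"/Tabs\s*(?:\((?P<string>(?:[^()\\]|\\.)*)\)|/(?P<name>[^\s/<>\[\]()]+))"
)


def find_tabs_entries(data: bytes):
    """Return a list of (kind, value) pairs for every /Tabs entry found.

    kind is "name" for a correctly typed value such as /S and
    "string" for a string literal such as (S).
    """
    found = []
    for match in TABS_ENTRY.finditer(data):
        if match.group("string") is not None:
            found.append(("string", match.group("string").decode("latin-1")))
        else:
            found.append(("name", match.group("name").decode("latin-1")))
    return found
```

To use it, read a file in binary mode and pass the bytes in, for example `find_tabs_entries(open("testform.pdf", "rb").read())`. Any `("string", ...)` result points to the kind of entry that strict validation rejects.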

@asciim0

asciim0 commented May 15, 2023

thank you very much for that explanation!
