PDF/A support #980

Closed
andreasrosdal opened this issue Nov 2, 2023 · 7 comments · Fixed by #1084

Comments

@andreasrosdal
Contributor

andreasrosdal commented Nov 2, 2023

A new PdfADocument and PdfAWriter for generating PDF/A-compliant documents seem useful.

I suggest creating unit tests that use veraPDF to validate PDF files created by OpenPDF.

https://github.com/veraPDF/veraPDF-validation

https://en.wikipedia.org/wiki/PDF/A

@andreasrosdal
Contributor Author

I have created a quick initial test case which validates PDF files generated by OpenPDF here:
https://github.com/LibrePDF/OpenPDF/blob/master/openpdf/src/test/java/com/lowagie/text/validation/PDFValidationTest.java

Currently it finds 5 validation errors:

Validation errors: 5
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.8.2.2, testNumber=1], status=failed, message=The document catalog dictionary shall include a MarkInfo dictionary with a Marked entry in it, whose value shall be true., location=Location [level=CosDocument, context=root], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.7.3, testNumber=7], status=failed, message=The value of Producer entry from the document information dictionary, if present, and its analogous XMP property pdf:Producer shall be equivalent., location=Location [level=CosDocument, context=root/trailer[0]/Info[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.8.3.3, testNumber=1], status=failed, message=The logical structure of the conforming file shall be described by a structure hierarchy rooted in the StructTreeRoot entry of the document catalog dictionary, as described in PDF Reference 9.6, location=Location [level=CosDocument, context=root/document[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.7.2, testNumber=1], status=failed, message=The document catalog dictionary of a conforming file shall contain the Metadata key., location=Location [level=CosDocument, context=root/document[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.5.3, testNumber=2], status=failed, message=An annotation dictionary shall contain the F key. The F key’s Print flag bit shall be set to 1 and its Hidden, Invisible and NoView flag bits shall be set to 0, location=Location [level=CosDocument, context=root/document[0]/pages[0](4 0 obj PDPage)/annots[0](1 0 obj PDAnnot)], locationContext=null, errorMessage=null]
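
For the annotation rule (clause 6.5.3), the condition in the message can be sketched in plain Java. This is only an illustration, not OpenPDF or veraPDF code: the class and method names are hypothetical, and the bit values are the ones from the annotation flags table in the PDF Reference (Invisible = bit 1, Hidden = bit 2, Print = bit 3, NoView = bit 6).

```java
// Hypothetical helper, not part of OpenPDF or veraPDF.
final class AnnotationFlagCheck {
    // Bit values of the annotation /F entry (PDF Reference, annotation flags).
    static final int INVISIBLE = 1;      // bit 1
    static final int HIDDEN = 1 << 1;    // bit 2
    static final int PRINT = 1 << 2;     // bit 3
    static final int NO_VIEW = 1 << 5;   // bit 6

    /**
     * Returns true if the F value meets the rule quoted above:
     * F present, Print set, and Hidden, Invisible and NoView all clear.
     * A null argument stands for a missing F key.
     */
    static boolean satisfiesClause653(Integer f) {
        if (f == null) {
            return false; // the F key is required
        }
        boolean printSet = (f & PRINT) != 0;
        boolean forbiddenClear = (f & (INVISIBLE | HIDDEN | NO_VIEW)) == 0;
        return printSet && forbiddenClear;
    }
}
```

Under this reading, an annotation with F = 4 (Print only) passes, while a missing F key or F = 0 fails. So the fix on the OpenPDF side is probably to set the Print flag on the annotations it writes.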

@andreasrosdal
Contributor Author

@Lonzak @bsanchezb @netmackan Are you able to assist here please?

@bsanchezb
Contributor

Adding @mkl-public to the loop as well

@Lonzak
Contributor

Lonzak commented Nov 2, 2023

Can you use PDFAFlavour.PDFA_1_B; and try again?

@andreasrosdal
Contributor Author

andreasrosdal commented Nov 2, 2023

With PDFAFlavour.PDFA_1_B, validation returns these 3 errors:

TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.7.3, testNumber=7], status=failed, message=The value of Producer entry from the document information dictionary, if present, and its analogous XMP property pdf:Producer shall be equivalent., location=Location [level=CosDocument, context=root/trailer[0]/Info[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.7.2, testNumber=1], status=failed, message=The document catalog dictionary of a conforming file shall contain the Metadata key., location=Location [level=CosDocument, context=root/document[0]], locationContext=null, errorMessage=null]
TestAssertion [ruleId=RuleId [specification=ISO 19005-1:2005, clause=6.5.3, testNumber=2], status=failed, message=An annotation dictionary shall contain the F key. The F key’s Print flag bit shall be set to 1 and its Hidden, Invisible and NoView flag bits shall be set to 0, location=Location [level=CosDocument, context=root/document[0]/pages[0](4 0 obj PDPage)/annots[0](1 0 obj PDAnnot)], locationContext=null, errorMessage=null]

So it would be interesting to find solutions to these validation errors.
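
As a starting point for the Producer rule (clause 6.7.3), here is a rough JDK-only sketch of the comparison it describes: read pdf:Producer from an XMP packet and compare it with the Producer string from the Info dictionary. The class and method names are hypothetical, and treating "equivalent" as exact string equality is my assumption, not something taken from veraPDF.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of OpenPDF or veraPDF.
final class ProducerCheck {
    static final String PDF_NS = "http://ns.adobe.com/pdf/1.3/";
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    /** Returns the pdf:Producer value from an XMP packet, or null if absent. */
    static String xmpProducer(String xmp) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmp.getBytes(StandardCharsets.UTF_8)));
            // Element form: <pdf:Producer>...</pdf:Producer>
            NodeList elems = doc.getElementsByTagNameNS(PDF_NS, "Producer");
            if (elems.getLength() > 0) {
                return elems.item(0).getTextContent();
            }
            // Attribute form: <rdf:Description pdf:Producer="...">
            NodeList descs = doc.getElementsByTagNameNS(RDF_NS, "Description");
            for (int i = 0; i < descs.getLength(); i++) {
                Element desc = (Element) descs.item(i);
                if (desc.hasAttributeNS(PDF_NS, "Producer")) {
                    return desc.getAttributeNS(PDF_NS, "Producer");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XMP packet", e);
        }
    }

    /** The rule only applies if the Info dictionary has a Producer entry. */
    static boolean producersMatch(String infoProducer, String xmp) {
        if (infoProducer == null) {
            return true;
        }
        return infoProducer.equals(xmpProducer(xmp));
    }
}
```

Note that this error and the missing Metadata key (clause 6.7.2) are likely related: without an XMP metadata stream in the catalog there is no pdf:Producer to compare against.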

@Lonzak
Contributor

Lonzak commented Nov 2, 2023

PDF/A-1a is for tagged PDFs, which need special handling. So for this test I think PDF/A-1b is the more appropriate flavour.
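
To see which failures are specific to the tagged flavour, one can diff the clause numbers of the two veraPDF runs. The sketch below is a hypothetical JDK-only helper (the names are mine) that pulls the `clause=` value out of the TestAssertion lines as they are printed above.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for comparing two validation runs by clause number.
final class ClauseDiff {
    // Matches e.g. "clause=6.8.2.2" in a printed TestAssertion line.
    private static final Pattern CLAUSE = Pattern.compile("clause=([0-9.]+)");

    /** Collects the distinct clause numbers, in order of first appearance. */
    static Set<String> clauses(List<String> assertionLines) {
        Set<String> result = new LinkedHashSet<>();
        for (String line : assertionLines) {
            Matcher m = CLAUSE.matcher(line);
            if (m.find()) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    /** Clauses that fail in the first run but not in the second. */
    static Set<String> onlyInFirst(List<String> first, List<String> second) {
        Set<String> diff = clauses(first);
        diff.removeAll(clauses(second));
        return diff;
    }
}
```

Applied to the two outputs in this thread, the clauses that disappear when switching to PDF/A-1b are 6.8.2.2 (MarkInfo) and 6.8.3.3 (StructTreeRoot), both of which concern tagging and logical structure.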

I tried fixing the first message (which is relatively easy), but then I asked myself what the goal is.
I think a later *text version has a PDFA-Generator that specifically aims to generate PDF/A-compliant PDFs. So maybe we should consider a new PdfADocument class, or specifying the type of PDF in the constructor of the Document class. But then it will take much more effort to always generate correct PDF/A documents...

@andreasrosdal changed the title from "Unit tests using veraPDF for validating PDF files created by OpenPDF" to "Unit tests using veraPDF for validating PDF files created by OpenPDF (PDF/A)" on Nov 2, 2023
@andreasrosdal
Contributor Author

andreasrosdal commented Nov 2, 2023

A new PdfADocument and PdfAWriter for generating PDF/A-compliant documents seem useful.
