add author to metadata destroys pdf/a conformance #150

Closed
femifrak opened this issue Dec 7, 2020 · 4 comments

Comments

@femifrak

femifrak commented Dec 7, 2020

Sorry for raising an issue similar to #143.
After adding author info to the metadata, the file no longer passes PDF/A conformance checks:

import sys

import pikepdf
# using pikepdf v.2.1.2

doc = pikepdf.open(sys.argv[1])

with doc.open_metadata() as meta:
    meta['dc:creator'] = 'Forname Surename'

doc.save('out.pdf')
doc.close()


in.pdf
out.pdf

Or is this not the way to add author info?
Thanks!!

@femifrak
Author

femifrak commented Dec 18, 2020

I can imagine that making files PDF/A-compliant is a thankless and unsatisfying task. Actually, all I care about is that the file will still be readable in 20 or 30 years; I am only marginally interested in the metadata. The main reason for my question is that I don't know of any tool that tests only my long-term readability requirements and disregards the metadata. Since the metadata is probably a bit difficult to deal with, I'm wondering whether I should suggest an option that deletes all (or as much as possible) of the metadata, resulting in a file that can more easily be made PDF/A-compliant.

What do you think?

(I just realized that this issue belongs more to ocrmypdf, sorry.)

@jbarlow83
Member

Looks like the complaint is that dc:creator needs to be an ordered array of names, not a single name. But I can't find where the spec describes how to copy this information to the document info dictionary where only a single string is accepted (I think).

Unfortunately deleting the metadata won't help, because there is information in the metadata that designates the document as a PDF/A.
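
To illustrate the point about the metadata carrying the PDF/A designation: the claim lives in the XMP keys `pdfaid:part` and `pdfaid:conformance`. The following is only a minimal sketch; the helper name `claimed_pdfa_level` is hypothetical, and it assumes the metadata object behaves like a mapping, as the object returned by `open_metadata()` does.

```python
from typing import Mapping


def claimed_pdfa_level(meta: Mapping[str, str]) -> str:
    """Return the PDF/A level the metadata claims, e.g. '1B', or '' if none.

    Hypothetical helper: it only reads the claim; it does not validate the file.
    """
    part = meta.get('pdfaid:part')
    if not part:
        # Without pdfaid:part the document does not identify itself as PDF/A.
        return ''
    conformance = meta.get('pdfaid:conformance') or ''
    return f"{part}{conformance}"


# Possible usage with pikepdf (assumes 'out.pdf' exists):
# import pikepdf
# with pikepdf.open('out.pdf') as pdf:
#     with pdf.open_metadata() as meta:
#         print(claimed_pdfa_level(meta))
```

A file that claims a level here can still fail validation, so an independent checker such as veraPDF is still needed.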

@jbarlow83
Member

So there's a simple fix: meta['dc:creator'] = ['Forname Surename']
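
For reference, the original script with this fix applied might look like the sketch below. The function names `as_creator_list` and `set_author` are hypothetical wrappers added for illustration; the only change to the reported code is that `dc:creator` receives a list instead of a bare string.

```python
from typing import Iterable, List, Union


def as_creator_list(authors: Union[str, Iterable[str]]) -> List[str]:
    """Wrap a single name in a list; pass several names through in order."""
    if isinstance(authors, str):
        return [authors]
    return list(authors)


def set_author(src: str, dst: str, authors: Union[str, Iterable[str]]) -> None:
    """Write dc:creator as an ordered list of names and save to dst."""
    # Third-party dependency, imported here so the helper above stays standalone.
    import pikepdf

    with pikepdf.open(src) as pdf:
        with pdf.open_metadata() as meta:
            # An ordered array of names, not a single string.
            meta['dc:creator'] = as_creator_list(authors)
        pdf.save(dst)


# Possible usage (assumes 'in.pdf' exists):
# set_author('in.pdf', 'out.pdf', 'Forname Surename')
```

Whether the result passes PDF/A validation still has to be confirmed with a checker such as veraPDF.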

@femifrak
Author

Sorry for the late reply.

Thanks a lot for the solution! This is great as I can use verapdf again now :)
