Creating priority for pdf/x over pdf #110

Closed
andreakb opened this issue Aug 25, 2016 · 4 comments

Comments

@andreakb

Hello,
I am looking at the attached file from the govdocs corpus.

272791.pdf

It multi-identifies as fmt/144 (Acrobat PDF/X - Portable Document Format - Exchange 1:1999) and fmt/16 (Acrobat PDF 1.2 - Portable Document Format). Looking around, that makes sense, because PDF/X is based on PDF 1.2. This is the documentation I could find about later versions of PDF/X:

2.3 Identification and conformance
A PDF file is identified by a header, the first line of which is a PDF comment that begins with “%PDF-” and is followed by a version number. While each conformance level requires that files conform to a specific version of PDF, they also state that the version number in the first line of the header is not relevant for determining PDF/X compliance.

A conforming PDF/X-1a:2003 file is identified by:
a) being a PDF file; and
b) having the GTS_PDFXVersion key in the Info dictionary with a value of (PDF/X-1a:2003)

A conforming PDF/X-2:2003 file is identified by:
a) being a PDF file;
b) having the GTS_PDFXVersion key in the Info dictionary with a value of (PDF/X-2:2003)

A conforming PDF/X-3:2003 file is identified by:
a) being a PDF file;
b) having the GTS_PDFXVersion key in the Info dictionary with a value of (PDF/X-3:2003)
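
As a rough illustration of why both identifications can fire on the same file, here is a minimal sketch that reads the two markers described above: the `%PDF-` header version and the `GTS_PDFXVersion` key. The function name `sniff_pdf` and the sample bytes are hypothetical, and the raw byte scan is an assumption made for brevity. It does not parse the Info dictionary properly, so it would miss keys stored in compressed object streams or written as hex strings, and it is not how DROID or PRONOM signatures work.

```python
import re

# Header comment such as "%PDF-1.3"; the scan is limited to the start of the file.
HEADER_RE = re.compile(rb"%PDF-(\d\.\d)")
# Literal-string form of the key, e.g. "/GTS_PDFXVersion (PDF/X-1a:2003)".
PDFX_RE = re.compile(rb"/GTS_PDFXVersion\s*\(([^)]*)\)")


def sniff_pdf(data: bytes):
    """Return (header_version, pdfx_version); either value may be None."""
    header = HEADER_RE.search(data[:1024])
    pdfx = PDFX_RE.search(data)
    return (
        header.group(1).decode("ascii") if header else None,
        pdfx.group(1).decode("latin-1") if pdfx else None,
    )


# Hypothetical, heavily simplified file content for demonstration only.
sample = (
    b"%PDF-1.3\n"
    b"1 0 obj\n<< /Title (Test) /GTS_PDFXVersion (PDF/X-1a:2003) >>\nendobj\n"
)
print(sniff_pdf(sample))  # ('1.3', 'PDF/X-1a:2003')
```

Both values come back for the same file, which mirrors the double identification: the header matches the plain PDF version and the key matches the PDF/X version.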

In PRONOM, fmt/144 is listed as a subtype of fmt/16, and it looks like the same subtype relationship is recorded for other versions of PDF/X as well. I am wondering whether "is subtype of" is related to priority at all? It seems that (at least in this case), without priority logic in place, a PDF/X file will identify as both the PDF version in the header and the PDF/X version specified in the Info dictionary.

Thank you!

@Dclipsham

Hi Andrea,
Yes, this looks like an oversight on our part. If you look at, for example, PDF/X-1a:2001 (http://www.nationalarchives.gov.uk/PRONOM/fmt/157), we have set its priority over PDF 1.3, and all of these PDF subtypes should have priority over their respective supertypes. I'll correct each of them in the September release.
Subtype doesn't do anything from an ID point of view; it's just a way of representing relationships between formats. We do need the 'has priority over' relationship too.

You're at ANZ, aren't you?

David
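
The effect of a 'has priority over' relationship on a multiple identification can be sketched as follows. This is only an illustration of the idea in the comment above, not DROID's actual implementation. The function `apply_priorities` and the `priority_over` table are hypothetical, and the single table entry stands for the fix being discussed (fmt/144 taking priority over fmt/16).

```python
def apply_priorities(matches, priority_over):
    """Drop every match that some other match has priority over."""
    suppressed = set()
    for puid in matches:
        # Collect the PUIDs this match outranks, if any.
        suppressed.update(priority_over.get(puid, ()))
    return [puid for puid in matches if puid not in suppressed]


# Hypothetical relationship table: PUID -> PUIDs it has priority over.
priority_over = {"fmt/144": {"fmt/16"}}

# Without the relationship, both identifications survive.
print(apply_priorities(["fmt/144", "fmt/16"], {}))  # ['fmt/144', 'fmt/16']
# With it, only the PDF/X identification is reported.
print(apply_priorities(["fmt/144", "fmt/16"], priority_over))  # ['fmt/144']
```

Note that the subtype relationship plays no part here, which matches the point that it only documents how formats relate to each other.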

@andreakb
Author

Thanks for that, David! The subtype explanation helps me, too. I am at ANZ.

@Dclipsham

This didn't make the September release, as I ran out of time to get it through our back-end systems, but I'll either do a micro-release in October or roll it in with the larger scheduled November release. Apologies!

@Dclipsham

Fixed in the March release. Sorry for the delay.
