CIDSet, again #811

Closed
a20god opened this issue Jun 22, 2017 · 7 comments
@a20god

a20god commented Jun 22, 2017

Here's a document that has a TrueType font with a CIDSet which contains 0x2000 octets of 0xff, that is, the first 65536 bits are set. How can this violate clause 6.2.11.4.2 of ISO 19005-2:2011? As CIDs must not exceed 65535, there cannot be a CID for which its bit in that CIDSet is not set.

6.2.11.4.2-todo-1.pdf
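
To make the arithmetic in the question concrete, here is a minimal Python sketch of how a CIDSet bit is looked up. It relies on the PDF 1.7 description of CIDSet as a table of bits indexed by CID, stored high-order bit first. The helper name `cid_bit` is hypothetical and is not taken from veraPDF or from the attached file.

```python
def cid_bit(cidset: bytes, cid: int) -> bool:
    """Return True if the bit for `cid` is set in the CIDSet stream data.

    CIDs that fall past the end of the stream are treated as unset
    (an assumption of this sketch).
    """
    byte_index, bit_in_byte = divmod(cid, 8)
    if byte_index >= len(cidset):
        return False
    # High-order bit first: CID 0 is the most significant bit of byte 0.
    return bool(cidset[byte_index] & (0x80 >> bit_in_byte))


# The stream described above: 0x2000 octets of 0xff.
cidset = b"\xff" * 0x2000
print(len(cidset) * 8)                                    # 65536
print(all(cid_bit(cidset, cid) for cid in range(65536)))  # True
```

With this stream every CID from 0 to 65535 is marked, so no used CID can be missing from it. The open question in the thread is whether the extra bits are allowed.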

@bdoubrov
Contributor

There has also been quite some discussion at the Validation TWG on this CIDSet clause. The decision is that CIDSet shall identify exactly the glyphs present in the subset, and nothing else. So, in your example, CIDSet sets the bit to 1 for glyphs that do not exist in the subset.

Note that "empty glyphs" in a TrueType font are also considered present because, for example, these are all valid space glyphs.
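
The "exactly, and nothing else" reading can be sketched as a two-way comparison. This is only an illustration under stated assumptions: the function name `cidset_mismatches` is hypothetical, and the set of CIDs present in the subset is taken as a given input rather than derived from a font file. It is not veraPDF's implementation.

```python
def cidset_mismatches(cidset: bytes, present_cids: set[int]) -> tuple[set[int], set[int]]:
    """Compare the CIDs marked in a CIDSet with the CIDs present in the subset.

    Returns (extra, missing):
      extra   - CIDs marked in the CIDSet but absent from the subset
      missing - CIDs present in the subset but not marked in the CIDSet
    """
    marked = {
        byte_index * 8 + bit
        for byte_index, byte in enumerate(cidset)
        for bit in range(8)
        if byte & (0x80 >> bit)  # high-order bit first
    }
    return marked - present_cids, present_cids - marked


# Hypothetical subset containing CIDs 0, 1 and 2, with a one-byte CIDSet of 0xff.
extra, missing = cidset_mismatches(b"\xff", {0, 1, 2})
print(sorted(extra))    # [3, 4, 5, 6, 7]
print(sorted(missing))  # []
```

Under this interpretation the CIDSet passes only when both sets are empty, so an all-ones stream fails because of the `extra` bits.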

@a20god
Author

a20god commented Jun 27, 2017

That is, all bits from the first one (for CID 0) up to and including the one for the highest CID whose GID is less than or equal to the highest GID in the font must be one, and all bits after them must be zero. In other words, CIDSet could be semantically replaced by a single integer.

@bdoubrov
Contributor

In case of an embedded TrueType subset, I believe yes. Note that this is different between PDF/A-1 and PDF/A-2(3). In PDF/A-1 the only requirement is that this /CIDSet entry is present. In PDF/A-2 it becomes optional, but if it is present, it shall satisfy the above requirement.

@a20god
Author

a20god commented Jul 5, 2017

For future reference: What I wrote above about CIDSet having to look like 11111...00000 (binary) is wrong as I (again) didn't take CIDToGIDMap into account (see https://github.com/veraPDF/veraPDF-validation-profiles/issues/147). Sorry.
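
A small Python sketch of why CIDToGIDMap breaks the 11111...00000 pattern. It assumes the stream form of CIDToGIDMap (two bytes per CID, high-order byte first, as in Table 117 of PDF 1.7) and assumes that a CID counts as present when its mapped GID is among the glyphs embedded in the subset. That second rule, the function names, and the sample data are illustrative assumptions, not the TWG's wording.

```python
def cids_in_subset(cid_to_gid_map: bytes, present_gids: set[int]) -> list[int]:
    """List the CIDs whose mapped GID is embedded in the subset (assumed rule)."""
    cids = []
    for cid in range(len(cid_to_gid_map) // 2):
        # Each CID has a 2-byte big-endian glyph index.
        gid = int.from_bytes(cid_to_gid_map[2 * cid : 2 * cid + 2], "big")
        if gid in present_gids:
            cids.append(cid)
    return cids


def build_cidset(cids: list[int]) -> bytes:
    """Build CIDSet stream data with one bit per CID, high-order bit first."""
    if not cids:
        return b""
    data = bytearray(max(cids) // 8 + 1)
    for cid in cids:
        data[cid // 8] |= 0x80 >> (cid % 8)
    return bytes(data)


# Hypothetical mapping for CIDs 0..5 and a subset that embeds GIDs 0, 1 and 2.
gids = [0, 7, 1, 9, 2, 8]
cid_to_gid_map = b"".join(gid.to_bytes(2, "big") for gid in gids)
cids = cids_in_subset(cid_to_gid_map, {0, 1, 2})
print(cids)                                  # [0, 2, 4]
print(format(build_cidset(cids)[0], "08b"))  # 10101000
```

Because the mapping can send consecutive CIDs to arbitrary GIDs, the set bits need not be contiguous, so the CIDSet cannot in general be reduced to a single integer.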

@bdoubrov
Contributor

It looks like the latest version of veraPDF (1.16) reports this file as valid. This issue needs to be rechecked.

bdoubrov reopened this Jul 30, 2020
bdoubrov assigned MaximPlusov and unassigned bdoubrov Jul 30, 2020
@bdoubrov
Contributor

It looks like the issue is in the value of the BaseFont entry, which differs between the Type0 font in question and its descendant CIDFont. Namely, the Type0 font name contains the subset prefix, but the CIDFont name does not. So veraPDF assumes that the CIDFont is not a subset and does not check the CIDSet entry.

To be discussed at the TWG.
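
For reference, the subset prefix in question is the tag defined in PDF 1.7: exactly six uppercase letters followed by a plus sign at the start of the font name. A minimal sketch of such a test follows. The function name `has_subset_tag` and the sample font names are hypothetical, and this is not veraPDF's actual code.

```python
import re

# Subset tag per PDF 1.7: six uppercase letters, then "+", at the start of the name.
SUBSET_TAG = re.compile(r"^[A-Z]{6}\+")


def has_subset_tag(base_font: str) -> bool:
    """Return True if the font name starts with a subset tag such as 'ABCDEF+'."""
    return SUBSET_TAG.match(base_font) is not None


# Hypothetical names mirroring the situation described above.
type0_base_font = "ABCDEF+SomeFont"
cidfont_base_font = "SomeFont"
print(has_subset_tag(type0_base_font))    # True
print(has_subset_tag(cidfont_base_font))  # False
```

A checker that applies this test only to the CIDFont's BaseFont would treat the font in this file as a full (non-subset) font, which matches the behavior described in the comment above.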

MaximPlusov assigned bdoubrov and unassigned MaximPlusov Feb 18, 2021
@bdoubrov
Contributor

The subset prefix is only significant for CIDFonts, as specified in Table 117 of the PDF 1.7 specification. For Type0 fonts, BaseFont does not play any special role, as indicated in the Note to Table 121 of the PDF 1.7 specification. So veraPDF's behavior is correct in this case.
