Use of /CMapVersion to reflect incompatible changes #14

Closed
lrosenthol opened this issue Mar 3, 2023 · 4 comments

Comments

@lrosenthol

Apparently 22 CMaps were modified in https://github.com/adobe-type-tools/cmap-resources/pull/13/files (merged on Jan 15, 2023) adding new characters from Unicode 13. However, Supplement numbers were not increased. Instead, the /CMapVersion entry in the CMap dictionary was modified.

For example, in https://github.com/adobe-type-tools/cmap-resources/pull/13/files#diff-d97d7ad5249e3d999019d33bcacea204e2cb4639b12be8e1159aa7ea32452d42, /CMapVersion was increased from 1.026 (decimal value type!) to 1.027.

This entry is mentioned in Adobe Tech Note 5014 as optional, without specifying its expected value type. It has not been used previously to reflect incompatible changes to CMap dictionaries. The use of /CMapVersion is also not documented anywhere in the PDF (ISO 32000-1, -2), PDF/A (ISO 19005-1, -2, -3, -4) or PDF/X (ISO 15930-1a, -3, -4, -5, -6, -7, -8, -9) standards.
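As a rough illustration of where these two values live, the sketch below pulls /CMapVersion and /Supplement out of the text of a CMap resource. The header excerpt is illustrative only (the Supplement value is a placeholder, not taken from the PR), and `read_cmap_header` is a hypothetical helper, not part of any existing tool. The version is kept as a string because the expected value type is not specified.

```python
import re

# Illustrative header excerpt; the Supplement value is a placeholder.
SAMPLE = """\
/CIDSystemInfo 3 dict dup begin
  /Registry (Adobe) def
  /Ordering (Japan1) def
  /Supplement 7 def
end def
/CMapVersion 1.027 def
"""

def read_cmap_header(text):
    """Return (CMapVersion as str, Supplement as int), or None for a missing entry."""
    version = re.search(r"/CMapVersion\s+(\S+)\s+def", text)
    supplement = re.search(r"/Supplement\s+(\d+)\s+def", text)
    return (
        version.group(1) if version else None,
        int(supplement.group(1)) if supplement else None,
    )

print(read_cmap_header(SAMPLE))
```

A processor holding an older copy of the same CMap would report the same Supplement but a different /CMapVersion, which is the situation described above.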

So, I have a feeling we have a potential interoperability issue here. If somebody starts using the newly added characters in generated PDFs today, processors that have not yet updated their CMaps would not understand them. And nothing in a PDF file records which version of a CMap was used.

Can the changes from #13 be backed out and redone using the update methods that have existed for 25+ years?

@punchcutter
Contributor

The Supplement isn't increased because there were no new characters added. Only mappings were changed from CIDs having no Unicode assigned to having Unicode assigned.

@punchcutter
Contributor

The "CMap Resources" section of https://github.com/adobe-type-tools/Adobe-Japan1 states:

In general, the CMap resources that are based on legacy encodings, such as Shift-JIS, are no longer being updated. Rather, the Unicode CMap resources—available for UTF-8, UTF-16 (UTF-16BE), and UTF-32 (UTF-32BE) encodings, and kept perfectly synchronized—are updated on a regular basis, with new mappings being triggered by a new Supplement or a new version of Unicode.

In this case the new mappings were triggered by the latter.

@lrosenthol
Author

> The Supplement isn't increased because there were no new characters added. Only mappings were changed from CIDs having no Unicode assigned to having Unicode assigned.

But doesn't that imply that the same CID, looked up against the newer version of the CMap, will produce a different result? If so, that is the definition of incompatible!
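One way to check which case applies is to diff the mappings of the two CMap versions. This is a minimal sketch, assuming each version has already been reduced to a dict from character code to CID. The code points and CIDs below are made up for illustration, and `diff_mappings` is a hypothetical helper.

```python
def diff_mappings(old, new):
    """Compare two {character code: CID} dicts from two versions of a CMap."""
    # Codes that had no mapping before and have one now.
    added = {code: cid for code, cid in new.items() if code not in old}
    # Codes mapped in both versions, but to different CIDs.
    changed = {
        code: (old[code], new[code])
        for code in old
        if code in new and old[code] != new[code]
    }
    # Codes that lost their mapping.
    removed = {code: cid for code, cid in old.items() if code not in new}
    return added, changed, removed

# Hypothetical data, not taken from the actual CMap resources.
old_version = {0x4E00: 1200}
new_version = {0x4E00: 1200, 0x9FF0: 20000}

added, changed, removed = diff_mappings(old_version, new_version)
print(added, changed, removed)
```

If only `added` is non-empty, existing codes still resolve as before, and only content that uses the new codes depends on the updated CMap. Entries in `changed` or `removed` would mean that existing content resolves differently.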

Please reopen @punchcutter

@MatthiasValvekens

Devil's advocate: YMMV, but I can't recall seeing a PDF file relying on any kind of "implicit" Unicode-to-CID CMap for rendering purposes (as in: text in content streams using literal UTF-16 codepoints to address a (non-embedded) font). I've seen this a lot with the various legacy encodings, sure, but that's not what's at issue here (since those CMaps aren't updated anymore).

If I wanted to use the latest and greatest Unicode version while ensuring backwards compat, I'd have to embed (and probably subset) the font anyhow, at which point a non-identity CMap offers pretty much no advantage anymore. So do we really expect this change to cause trouble in the field?

On a more theoretical note, I don't think ISO 32000-2:2020, 9.7 pins its CMaps to any particular Unicode version (which may or may not be a bug). AFAICT that clause only talks about compatibility requirements for the underlying character collections, not the mappings into them from various encodings.
