fontContainsCharacters() returns False for characters > U+FFFF even though the font actually maps them #524

Closed
josh-hadley opened this issue Apr 1, 2023 · 3 comments · Fixed by #525

@josh-hadley

What the title says. I suspect the root cause is in CoreText.CTFontGetGlyphsForCharacters: either it does not support SMP characters, or the call to that function needs different treatment when the string passed to fontContainsCharacters() contains SMP characters.

Steps to repro:

  1. Install a font with known support for SMP characters, such as Noto Sans Deseret, which supports characters in the range U+10400-U+1044F
  2. Check drawBot.fontContainsCharacters() for a character in that range and compare to the font's cmap (e.g. fontTools getBestCmap()), something like this:
import drawBot as db
from fontTools.ttLib import TTFont

db.font('NotoSansDeseret-Regular', 48)
fontpath = db.fontFilePath()
ttfont = TTFont(fontpath)
umap = ttfont['cmap'].getBestCmap()
u = 0x10400
c = chr(u)
dbFontContains = db.fontContainsCharacters(c)
inFontCmap = u in umap
print(f'{dbFontContains=}, {inFontCmap=}')
  3. Observe that dbFontContains=False and inFontCmap=True (for U+10400; both are True for BMP characters present in the font such as U+0020 or U+00A0)

Expected behavior:

fontContainsCharacters() should return True for any character that is mapped in the font's most comprehensive Unicode cmap subtable.
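The expected behavior can be written down as a small reference check. This is only a sketch: `cmap_contains_all` is a hypothetical helper name, and `sample_map` is a hand-made stand-in for the dict returned by `getBestCmap()` (its glyph names are made up for illustration).

```python
def cmap_contains_all(unicode_map, characters):
    """Return True if every character's code point is a key in unicode_map."""
    return all(ord(ch) in unicode_map for ch in characters)


# Stand-in for ttfont['cmap'].getBestCmap(); the glyph names are invented.
sample_map = {0x0020: "space", 0x10400: "u10400"}

print(cmap_contains_all(sample_map, " "))           # BMP character in the map
print(cmap_contains_all(sample_map, chr(0x10400)))  # SMP character in the map
print(cmap_contains_all(sample_map, "A"))           # character not in the map
```

With a real font, the dict from `ttfont['cmap'].getBestCmap()` in the repro script above would be passed in place of `sample_map`.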

@justvanrossum
Collaborator

Can reproduce. The relevant code is here:

def fontContainsCharacters(self, characters):
    """
    Return a bool if the current font contains the provided `characters`.
    Characters is a string containing one or more characters.
    """
    font = self._getNSFontWithFallback()
    if font is None:
        return False
    result, glyphs = CoreText.CTFontGetGlyphsForCharacters(font, characters, None, len(characters))
    return result

It looks pretty innocent.

I wonder if somehow PyObjC does something wrong when converting the characters argument.

@justvanrossum
Collaborator

After trying a few things, I think it's a PyObjC bug with CTFontGetGlyphsForCharacters. I suspect it has to encode the string as UTF-16, but doesn't.

Reference: https://developer.apple.com/documentation/coretext/1510813-ctfontgetglyphsforcharacters?language=objc. The function takes an array of UniChar, and UniChar is a 16-bit type.
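To illustrate why a 16-bit UniChar matters here, the sketch below compares Python's `len()` with the number of UTF-16 code units for a BMP and an SMP character. `utf16_code_units` is a hypothetical helper name, not part of drawBot or PyObjC, and the snippet does not call CoreText at all.

```python
def utf16_code_units(text: str) -> int:
    """Return the number of 16-bit code units needed to encode text as UTF-16."""
    # Encode without a BOM; every UTF-16 code unit occupies exactly two bytes.
    return len(text.encode("utf-16-le")) // 2


bmp_char = chr(0x0020)   # SPACE, in the Basic Multilingual Plane
smp_char = chr(0x10400)  # a Deseret letter, in the Supplementary Multilingual Plane

for ch in (bmp_char, smp_char):
    print(f"U+{ord(ch):04X}: len={len(ch)}, utf16_units={utf16_code_units(ch)}")
# prints:
# U+0020: len=1, utf16_units=1
# U+10400: len=1, utf16_units=2
```

Python counts code points, so `len(characters)` is 1 for U+10400, while a UniChar buffer needs two entries (a surrogate pair) to hold it. Whether passing the UTF-16 code unit count instead of `len(characters)` is enough to work around the problem is an untested assumption here, not something this thread confirms.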

@josh-hadley
Author

Thanks @justvanrossum and @typemytype for the quick fix!
