fontContainsCharacters() returns False for characters > U+FFFF even though the font actually maps them #524

josh-hadley · 2023-04-01T03:36:41Z

What the title says. I suspect the root cause is in CoreText.CTFontGetGlyphsForCharacters: either it does not support SMP (Supplementary Multilingual Plane) characters, or the call needs different treatment when the string passed to fontContainsCharacters() contains SMP characters.

Steps to repro:

1. Install a font known to support SMP characters, such as Noto Sans Deseret, which covers the range U+10400 to U+1044F.
2. Check drawBot.fontContainsCharacters() for a character in that range and compare the result to the font's cmap (e.g. fontTools getBestCmap()), something like this:

import drawBot as db
from fontTools.ttLib import TTFont

# Select the installed font and load the same file with fontTools
db.font('NotoSansDeseret-Regular', 48)
fontpath = db.fontFilePath()
ttfont = TTFont(fontpath)
# Mapping of code points to glyph names from the best Unicode cmap subtable
umap = ttfont['cmap'].getBestCmap()
u = 0x10400  # DESERET CAPITAL LETTER LONG I, outside the BMP
c = chr(u)
dbFontContains = db.fontContainsCharacters(c)
inFontCmap = u in umap
print(f'{dbFontContains=}, {inFontCmap=}')

3. Observe that dbFontContains=False and inFontCmap=True for U+10400. Both are True for BMP characters present in the font, such as U+0020 or U+00A0.

Expected behavior:

fontContainsCharacters() should return True for any character that is mapped in the font's most comprehensive Unicode cmap subtable.
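
The expected behavior can be written down as a small reference check. This is only a sketch: `cmap_contains` is a hypothetical helper, not part of drawBot, and the hand-written dict stands in for the mapping that fontTools' getBestCmap() returns (its glyph names are illustrative).

```python
def cmap_contains(cmap, characters):
    """Return True if every character in `characters` has an entry in `cmap`.

    `cmap` maps integer code points to glyph names, like the dict that
    fontTools' getBestCmap() returns.
    """
    return all(ord(char) in cmap for char in characters)


# Hand-written stand-in for a real cmap (glyph names are illustrative).
example_cmap = {0x0020: "space", 0x00A0: "uni00A0", 0x10400: "u10400"}

print(cmap_contains(example_cmap, "\u0020\u00a0"))  # BMP characters only
print(cmap_contains(example_cmap, chr(0x10400)))    # SMP character in the cmap
print(cmap_contains(example_cmap, chr(0x10401)))    # SMP character not in the cmap
```

Because Python strings are sequences of code points, this check treats BMP and SMP characters the same way, which is the behavior expected of fontContainsCharacters().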


justvanrossum · 2023-04-01T09:28:21Z

Can reproduce. The relevant code is here:

drawbot/drawBot/context/baseContext.py

Lines 1796 to 1805 in 100dbdf

    
    def fontContainsCharacters(self, characters):
        """
        Return a bool if the current font contains the provided `characters`.
        Characters is a string containing one or more characters.
        """
        font = self._getNSFontWithFallback()
        if font is None:
            return False
        result, glyphs = CoreText.CTFontGetGlyphsForCharacters(font, characters, None, len(characters))
        return result

It looks pretty innocent.

I wonder if somehow PyObjC does something wrong when converting the characters argument.

justvanrossum · 2023-04-01T12:58:57Z

After trying a few things, I think it's a PyObjC bug with CTFontGetGlyphsForCharacters. I suspect it has to encode the string as UTF-16, but doesn't.

See https://developer.apple.com/documentation/coretext/1510813-ctfontgetglyphsforcharacters?language=objc: the function takes an array of UniChar, and UniChar is a 16-bit type.
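
To illustrate why a 16-bit character type matters here, the sketch below compares Python's len() with the number of 16-bit code units the same string occupies in UTF-16. `utf16_length` and `utf16_code_units` are hypothetical helper names, and whether passing the UTF-16 count is the right remedy for the PyObjC call is an assumption, not something confirmed in this thread.

```python
def utf16_length(text):
    """Return the number of 16-bit code units `text` occupies in UTF-16."""
    # "utf-16-be" emits no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-be")) // 2


def utf16_code_units(text):
    """Return the UTF-16 code units of `text` as a list of ints."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


bmp = chr(0x00A0)   # fits in a single 16-bit unit
smp = chr(0x10400)  # needs a surrogate pair

print(len(bmp), utf16_length(bmp))
print(len(smp), utf16_length(smp))
print([hex(unit) for unit in utf16_code_units(smp)])
```

For U+10400, len() counts one code point while UTF-16 needs two code units (a surrogate pair), so a count derived from len(characters) would cover only half of the character if the callee reads UTF-16 units.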

josh-hadley · 2023-04-01T16:19:39Z

Thanks @justvanrossum and @typemytype for the quick fix!

This was referenced Apr 1, 2023:

CTFontGetGlyphsForCharacters() output not correct for chars > U+FFFF (ronaldoussoren/pyobjc#546, closed)

Fix fontContainsCharacters() for chars > U+FFFF (#525, merged)

typemytype closed this as completed in #525 Apr 1, 2023
