By removing duplicates, luaotfload messes up codepoints #185
Comments
For the node shaper this is by design: All mappings in the font for the Supplementary Private Use Areas A and B (aka everything starting with U+F0000) are ignored, and these codepoints are instead used to map all glyphs which don't have another mapping. The codepoints are assigned in GID order and therefore might be completely unrelated to any potential assignments to Supplementary Private Use codepoints in the font. The recommended way to use such fonts is using the HarfBuzz shaper (which specifically ensures that all codepoint assignments from the font are preserved) or by accessing the glyphs through glyphnames (for fonts which have useful glyphnames, that is).
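The remapping described above can be modelled in a few lines of Python. This is only a toy sketch, not luaotfload or ConTeXt code: the function names, the example font, the starting value U+F0000 and the treatment of glyph 0 are all assumptions made for illustration.

```python
# Toy model only: not luaotfload/ConTeXt code. GIDs and mappings are made up.
PUA_A_START = 0xF0000  # assumed first codepoint handed out by the remapping


def is_supplementary_pua(cp):
    """True for Supplementary Private Use Area A or B (planes 15 and 16)."""
    return 0xF0000 <= cp <= 0xFFFFD or 0x100000 <= cp <= 0x10FFFD


def node_style_remap(cmap, glyph_count):
    """Drop the font's own supplementary-PUA mappings, then give every glyph
    left without a mapping a fresh PUA-A codepoint, in GID order."""
    kept = {cp: gid for cp, gid in cmap.items() if not is_supplementary_pua(cp)}
    mapped = set(kept.values())
    next_cp = PUA_A_START
    for gid in range(glyph_count):
        if gid not in mapped:
            kept[next_cp] = gid
            next_cp += 1
    return kept


# Hypothetical font: GID 1 is "A"; GIDs 2 and 3 are icons the font designer
# placed at U+F0005 and U+F0001; GID 0 has no mapping at all.
font_cmap = {0x41: 1, 0xF0005: 2, 0xF0001: 3}
remapped = node_style_remap(font_cmap, glyph_count=4)

for cp in sorted(remapped):
    print(f"U+{cp:04X} -> GID {remapped[cp]}")
```

In this model U+F0001 ends up pointing at a different glyph than the font intended, and U+F0005 no longer maps to anything, which is the kind of offset the issue reports.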
I understand and would like to apologize in advance for the long post below. Unfortunately, this behavior makes it quite difficult to:
Furthermore, in addition to the incompatibility with xelatex, your explanation suggests that this design decision causes an "internal" incompatibility within lualatex itself: the same document source produces results that change with the shaper, not just in the quality of the font rendering (which is to be expected) but in the very semantics of the document, because you end up silently using completely different characters. This may lead to interesting results. For instance, I imagine it would not be hard to design a font and a document so that the document looks quite innocuous in source form or when compiled via xelatex or lualatex+harfbuzz, but contains insulting pictograms (or even text) once another user compiles it with lualatex and the node shaper, because of the codepoint replacement.

An additional issue is that it may not be possible to use the HarfBuzz shaper as you suggest, because the feature set is different. For instance, the node shaper supports "variable otf" fonts (which one can easily expect to rapidly become popular even for printed docs and not just on the web) while I read that the HarfBuzz shaper will likely not get that ability.

With respect to your second suggestion, can you please expand on how to access glyphs via glyphnames from lualatex? Is there anything similar to the \char, \uchar or \symbol command for that?

Finally, do you think that it could be possible for the luaotfload developers to introduce some change to the current behavior i.e. rethink the
I really would like to advocate that the behavior be made more controllable. The intended purpose of PUA-A and PUA-B seems to be that they are deliberately left undefined so that third parties may define their own characters without conflicting with Unicode Consortium assignments, not that they serve as a freely available working area for the renderer. Consequently, if luatex really cannot do without reusing these areas, IMHO it should try to minimize the disruption. Not doing so may break any "agreement" on the usage of such areas, be they private to some organization or (more broadly) private as published by some font designers for a specific font, or even by entities such as CSUR (which uses PUA-A and PUA-B) for wider purposes.
But this here is not a variable font.
Variants of glyphs need a number too.
Currently the glyph names are not in the lua and so not accessible. From a remark on the context list I guess that the names are dropped because the font hasn't named all glyphs.
The code for the node mode is imported from context. I would suggest that you discuss this on the context mailing list, there is already a thread about this font. |
@u-fischer thanks for the details, that was very helpful!
I am aware that the Material Design Icon font is not variable. I mentioned variable fonts with future-proofing in mind, as it is not unlikely that we will get variable icon fonts in the future.
Out of curiosity, and if I am not stealing too much time, how does the harf mode handle this?
I will definitely take a look at the context ml. Thanks for the pointer. |
Given the history of multiple master fonts (and Metafonts) I'm not convinced that one can expect that, but we will see. In any case the HarfBuzz shaper probably will get that ability, it's just a bit unclear when. (Basically we need to instantiate fonts. This has to be done either through Lua code or in HarfBuzz, and using HarfBuzz functionality would be more in line with the general approach of the HarfBuzz shaper. Therefore this is waiting for upstream support.)
In
which works as long as the node shaper is used. But the webfont in question does not have glyphnames.
As Ulrike already wrote, this is ConTeXt code so it will be fixed if it gets fixed in ConTeXt.
Basically LuaTeX stores the text internally in a bunch of XeLaTeX works completely different on that level so it isn't really comparable, the harfbuzz shaper gives such glyphs "codepoints" outside of the range of valid Unicode values. (So 0x110000 and higher) This leads to different issues (e.g. it's not so easy to just insert these glyphs directly as with the names glyph code I gave above since |
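As a small Python sketch of what "outside the range of valid Unicode values" means in practice: the highest Unicode codepoint is U+10FFFF, so a value such as 0x110000 cannot be turned into a character or encoded as UTF-8 at all. The helper names below are hypothetical, and the sketch says nothing about how the HarfBuzz shaper actually chooses its numbers.

```python
MAX_UNICODE = 0x10FFFF  # highest codepoint defined by the Unicode standard


def is_valid_unicode(cp):
    """True if cp lies inside the Unicode codespace."""
    return 0 <= cp <= MAX_UNICODE


def try_encode(cp):
    """Return the UTF-8 bytes for cp, or None when cp cannot be a character."""
    try:
        return chr(cp).encode("utf-8")
    except ValueError:
        # chr() rejects values above U+10FFFF; surrogates fail at encode time
        # (UnicodeEncodeError is a subclass of ValueError).
        return None


for cp in (0xF1372, 0x10FFFF, 0x110000):
    print(f"0x{cp:X}: valid={is_valid_unicode(cp)} utf8={try_encode(cp)}")
```

A PUA codepoint such as U+F1372 can still be pasted into a UTF-8 source file, whereas a value at or above 0x110000 has no textual representation.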
For those who might be interested, the relevant ConTeXt thread is |
A quick update:
It is already merged into
There are two smaller issues which will be fixed with the next ConTeXt upload, therefore we won't do a release before that. But I'm hoping to make a new release soon after that. |
@zauguin Now that a new release of luaotfload is out (I have just received it via tlmgr) I would like to start experimenting with it. I wonder if you could be so kind as to help me (and maybe other readers of this issue) with any pointer to where some documentation about the font tables can be found. I have been unable to find anything about the Thanks for any help, and sorry if this question is not really 100% consistent with the issue it is attached to.
Anything not documented in the LuaTeX manual, the ConTeXt manual or the luaotfload documentation is considered an internal implementation detail. As far as I am aware, the By the way, there is a documented alternative way to get the codepoint for a glyphname:
There is no |
@zauguin Thanks a lot! I thought there ought to be a

    -- Open the font and print the name of every glyph it contains.
    local f = fontloader.open('PunkNova.kern.otf')
    print(f.fontname)
    if f.glyphcnt > 0 then
      for i = f.glyphmin, f.glyphmax do
        local g = f.glyphs[i]
        if g then
          print(g.name)
        end
      end
    end
    fontloader.close(f)

But it is not completely clear to me if the object returned by the |
... and I do not seem to be very successful with the |
Hi,
It looks like `luaotfload` removes duplicates from fonts, hence breaking the expected correspondence between codepoint and actual character (the result is that it introduces an offset). Not only does this make it quite hard to pick the correct character when looking at font glyph tables, it also breaks "cut and paste" (pasting a character in the latex source does not provide that character in the PDF) and breaks compatibility with `xelatex`.

See the following example using the Material Design Icon Font with lualatex.

The Material Design icon set is made available as a webfont, including a ttf version at https://materialdesignicons.com/ (use the download button to get version 5.4.55). The downloaded file `materialdesignicons-webfont.ttf` declares the font name "Material Design Icons" and, when saved at a system-accessible place, seems to work just fine with `xelatex`.

However, if I try to use that with luatex as:
then I cannot seem to get the correct characters. For instance, U+F1372 should be the mdi-account-details-outline character according to the table in https://pictogrammers.github.io/@mdi/font/5.4.55/ but it turns out as a different character.
The matter is discussed at https://tex.stackexchange.com/questions/596610/how-to-use-luatex-with-large-unicode-codepoint/596626#596626
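For readers unfamiliar with where F1372 sits, the following Python sketch inspects the codepoint using only the standard library. It confirms that the value lies in plane 15 (Supplementary Private Use Area A) and that Unicode itself assigns it no name, so which glyph appears depends entirely on the font and on the mapping the renderer applies. The variable names are illustrative only.

```python
import unicodedata

cp = int("F1372", 16)  # the codepoint quoted in the report
ch = chr(cp)

plane = cp >> 16                       # which 64K-codepoint plane it falls in
category = unicodedata.category(ch)    # general category from the Unicode database
name = unicodedata.name(ch, None)      # None when Unicode defines no name

print(f"U+{cp:04X}: plane={plane}, category={category}, name={name}")
```

The category reported for private-use characters is `Co`, which is why tools that only know Unicode cannot say what the character should look like.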