Ligatures interrupt colors #153

Closed
Udi-Fogiel opened this issue Sep 25, 2023 · 6 comments

@Udi-Fogiel
Contributor

Consider the following document:

\fontfam[lm]

ff{\Blue i}
\bye

As you can see, the i does not get colored. The reason is fairly clear: the color mechanism sees the ligature only at ship-out time, and therefore it cannot break the ligature (as implementations using whatsits do). I don't know if there is a simple fix for that (maybe the ligature-forming process should be color-aware?), but it might be worth documenting this until it is changed, if it ever is.

Here is an example where the ligature is interrupted:

\fontfam[lm]

ff\hbox{}{\Blue i}
\bye
@olsak
Owner

olsak commented Sep 26, 2023

Thank you for noticing this. We declare it a feature that a ligature does not fall apart when the color changes inside it. I'll try to formulate this in the documentation. The reason is that we want to keep things simple, and a solution to this issue would go against that principle. Users have to break the ligature themselves, for example with ff\null {\Blue i} or {\Blue ff}\null i.
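For reference, the three variants mentioned so far can be put side by side in one test file. This is only a sketch assembled from the snippets in this thread; the comments describe the behavior reported here and have not been checked against other font families.

```tex
\fontfam[lm]

% Ligature kept: "ffi" is one glyph, so the color change on "i" has no visible effect.
ff{\Blue i}

% Ligature broken by an empty box (\null) before the colored letter.
ff\null {\Blue i}

% Alternatively, color the "ff" part and break the ligature before the plain "i".
{\Blue ff}\null i
\bye
```

In the last two paragraphs the i is set as a separate glyph, so the fi/ffi ligature shape is lost; that is the price of the manual break.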

@Udi-Fogiel
Contributor Author

Udi-Fogiel commented Sep 28, 2023

Rereading my ticket, I see that I wasn't very clear. I agree that colors should not break ligatures. I was wondering whether there is any way to color only part of a glyph when it is a ligature; maybe the color information could be passed to the ligature-forming process.

But I mainly opened this ticket for the documentation, so I'll close it. Thank you.

@olsak
Owner

olsak commented Sep 29, 2023

Your question is: is it possible to color a letter partly? The "letter" can be a ligature or a common letter.

We can do this with the PDF clipping-path operators. The following code colors only part of the letter A red:

 \noindent 
\pdfliteral{q 0 0 4 100 re W n}\rlap{A}\pdfliteral{Q q 1 0 0 rg 4 0 10 100 re W n}\rlap{A}\pdfliteral{Q}\kern7pt next text.
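The same idea can be wrapped in a small macro. The following is only a sketch: the name \splitcolor and its two parameters are hypothetical (they are not part of OpTeX), the split position 6 is a guess for the ffi ligature at the default size, and the macro relies on the same assumptions as the code above, namely that \pdfliteral places the origin at the current point, that the numbers are in bp, and that no other color setting is emitted for the text in the box.

```tex
% \splitcolor{<split position in bp>}{<text>} -- hypothetical helper, not part of OpTeX.
% Typesets <text> twice at the same place: the part left of the split
% in the current color, the part right of the split in red.
\def\splitcolor#1#2{\leavevmode
   \setbox0=\hbox{#2}%
   \pdfliteral{q 0 -100 #1 200 re W n}\rlap{\copy0}% clip to x in [0,#1]
   \pdfliteral{Q q 1 0 0 rg #1 -100 200 200 re W n}\rlap{\copy0}% clip to x >= #1, fill red
   \pdfliteral{Q}\kern\wd0 }

\fontfam[lm]

\splitcolor{6}{ffi} next text.
\bye
```

Unlike the original snippet, the clipping rectangles here extend below the baseline, so descenders are not cut off. The split position still has to be found by hand for each glyph and font size.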

@vlasakm
Contributor

vlasakm commented Sep 29, 2023

Just for future reference:

The luacolor package, which does the same kind of coloring in LaTeX, explicitly documents the limitation: http://mirrors.ctan.org/macros/latex/contrib/luacolor/luacolor.pdf#subsection.1.3.

luaotfload, which also supports coloring, lets the user choose the callback at which the coloring is applied (the default is post_linebreak_filter): http://mirrors.ctan.org/macros/luatex/generic/luaotfload/luaotfload-latex.pdf#section*.8.

In theory, color is a different style, so it should prevent ligatures, much as a switch to bold or italic would. However, as mentioned in the linked article, color is sometimes special-cased, because we naturally perceive it differently, especially with Latin scripts. Here is a more complex example that has no nice solution: https://faultlore.com/blah/text-hates-you/#style-can-change-mid-ligature.

The technical reason why color isn't taken into account during ligaturing is that we use LuaTeX attributes, and no code takes our color attribute into account. We could perhaps implement colors differently and force text runs to be split at color changes (as font changes do), but as hinted above, that may not even be what we want.

I don't know what brought you to investigate the issue, but this question on Stack Exchange seems related in an interesting way: https://tex.stackexchange.com/questions/477143/losing-ligatures-when-switching-font-series-or-color-in-the-middle-of-a-word.

@Udi-Fogiel
Contributor Author

> We can do this with the PDF clipping-path operators. The following code colors only part of the letter A red:
>
> \noindent
> \pdfliteral{q 0 0 4 100 re W n}\rlap{A}\pdfliteral{Q q 1 0 0 rg 4 0 10 100 re W n}\rlap{A}\pdfliteral{Q}\kern7pt next text.

Thanks for the suggestion, I always appreciate seeing how you use literals. I guess that if the ligatures are formed by luaotfload, I'll have to add some code to pre_linebreak_filter that runs after luaotfload and modifies the ligatures (sadly, not all Hebrew ligatures are Unicode characters).

> The luacolor package, which does the same kind of coloring in LaTeX, explicitly documents the limitation: http://mirrors.ctan.org/macros/latex/contrib/luacolor/luacolor.pdf#subsection.1.3.

Yes, this is mostly why I opened the ticket. I was just surprised that this fact wasn't documented in OpTeX as well.

> In theory, color is a different style, so it should prevent ligatures, much as a switch to bold or italic would. However, as mentioned in the linked article, color is sometimes special-cased, because we naturally perceive it differently, especially with Latin scripts. Here is a more complex example that has no nice solution: https://faultlore.com/blah/text-hates-you/#style-can-change-mid-ligature.

Very interesting, thanks!

> I don't know what brought you to investigate the issue

See https://tex.stackexchange.com/a/699207/264024 for an example. In Hebrew, although punctuation marks never overlap with or connect to the base character, they are often combined into a ligature with the base letter to correct the positioning.

By the way, @vlasakm, if you read the linked post: do you know what a char number outside the Unicode range means? I know that at this stage luaotfload can assign glyph IDs to nodes instead of Unicode code points, and that these numbers can depend on whether you use HarfBuzz or the default shaping method, and maybe even on the font, but I did not understand how these numbers are calculated.

@vlasakm
Contributor

vlasakm commented Nov 3, 2023

> By the way, @vlasakm, if you read the linked post: do you know what a char number outside the Unicode range means? I know that at this stage luaotfload can assign glyph IDs to nodes instead of Unicode code points, and that these numbers can depend on whether you use HarfBuzz or the default shaping method, and maybe even on the font, but I did not understand how these numbers are calculated.

In the ConTeXt font processing code (i.e. what luaotfload calls the node shaper), they stick with Unicode before and after shaping. They use one of the Unicode private use areas to give a code point even to glyphs that have a glyph ID but no Unicode code point (e.g. ligatures, which becomes interesting with e.g. ff, which does have a code point in Unicode). I am not sure exactly how the numbers are calculated. But since they correspond to glyph IDs, I would guess that a 1:1 mapping (i.e. glyph_id + 0xF0000, to map into the plane 15 private use area) would be the most straightforward, or else assigning private use area code points to glyphs in the order they are encountered.

In your code you use luaotfload with HarfBuzz (the harf shaper), so the situation is different, and apparently code points "outside of Unicode" are assigned to glyphs that do not correspond directly to Unicode code points.

See:

latex3/luaotfload#198
latex3/luaotfload#185
https://www.pragma-ade.nl/general/manuals/fonts-mkiv.pdf (search for e.g. private , note the trailing space)

Do any of the functions in the luaotfload manual, section 11.2.1 "Font Properties", help with the (reverse) mapping? I am getting confused by the slot / gid names and am not sure whether it is relevant.

Anyway, this seems to be outside my area of expertise. I suggest asking on the luaotfload GitHub or, in specific cases, on the ntg-context mailing list.
