Use predefined name for the ligature parts #1007

Closed
leiserfg opened this issue May 17, 2021 · 81 comments

@leiserfg

Hi @be5invis, have you checked this comment from @kovidgoyal?
kovidgoyal/kitty#297 (comment)
It would be pretty nice to have the best monospaced font and the best terminal emulator working together 🙏

@kovidgoyal

How is naming your glyphs sensibly a compromise? You want to do stupid ass things like have infinite length ligatures, at least do it right.

@be5invis
Owner

@kovidgoyal If we do this, perhaps Kitty could support .init/.medi/.fina/.isol too. These suffixes are how font makers usually name joining letterforms, as in Arabic.

@kovidgoyal

Is there some reason you want to use a different naming scheme
from other fonts that support infinite length ligatures? It would
be nice to have a consistent naming scheme across fonts.

To be precise, glyphs that are part of an infinite ligature are
currently assumed to have the following naming scheme

prefix_(start|middle|end).seq

where prefix must not contain the _ character. This scheme
is followed by at least Fira Code and Cascadia Code and possibly
MonoLisa, though I am not sure of the last.

If you have some reason to want a different naming scheme, I am willing
to implement support for it, though I would urge you to keep it as
simple as possible to avoid making lookups unnecessarily expensive.
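
The naming scheme described above can be sketched as a small name parser. This is only an illustration of the stated pattern, not kitty's actual C implementation; the glyph names in the usage lines are illustrative assumptions.

```python
import re

# Pattern for the scheme described above: prefix_(start|middle|end).seq,
# where the prefix must not contain an underscore.
INFINITE_LIG_RE = re.compile(r"(?P<prefix>[^_]+)_(?P<role>start|middle|end)\.seq")


def parse_infinite_ligature_name(glyph_name):
    """Return (prefix, role) if the name follows the scheme, else None."""
    match = INFINITE_LIG_RE.fullmatch(glyph_name)
    if match is None:
        return None
    return match.group("prefix"), match.group("role")


# Illustrative glyph names (assumptions, not read from any font file).
print(parse_infinite_ligature_name("equal_start.seq"))    # ('equal', 'start')
print(parse_infinite_ligature_name("hyphen_middle.seq"))  # ('hyphen', 'middle')
print(parse_infinite_ligature_name("less_equal.liga"))    # None
```

Because the prefix cannot contain an underscore, the check is a single anchored match, which keeps per-glyph lookups cheap.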

@kovidgoyal

Here is the function used to detect infinite ligature glyphs in kitty: https://github.com/kovidgoyal/kitty/blob/master/kitty/fonts.c#L810

@be5invis
Owner

be5invis commented May 18, 2021

@kovidgoyal One problem with your pattern is that all ligations must follow the "start-middle-end" pattern, while in Iosevka the glyphs are used in a more complicated pattern. For example, the ligation of != is actually a long equal sign with a bar overlay. Why do this? Because it is the easiest way to support all these language-specific ligation sets.

And sometimes ligations will interact with each other, so making ligations' names follow this pattern will further complicate implementations.

image

I am curious about whether it is possible to have a single suffix to indicate that this glyph is a part of a ligature. Or perhaps we could leverage GDEF's Glyph Class Definition, and glyphs with class 4 will be considered a part of a ligation. This class could be retrieved with hb_ot_layout_get_glyph_class.

Once Kitty identifies a continuous sequence of glyphs that are all considered "part of a ligature", it would render them together.
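
The single-marker idea can be sketched as a grouping pass over shaped glyphs. This is a minimal sketch under assumptions: the GDEF classes are passed in as plain integers (in a real renderer they might come from hb_ot_layout_get_glyph_class), and the glyph names are hypothetical.

```python
from itertools import groupby

GDEF_COMPONENT = 4  # GDEF glyph class 4 ("component"), as proposed above


def group_runs(glyphs):
    """Group consecutive class-4 glyphs into runs.

    `glyphs` is a list of (glyph_name, gdef_class) pairs. Every maximal run of
    class-4 glyphs becomes one group; every other glyph is its own group.
    """
    groups = []
    for is_part, run in groupby(glyphs, key=lambda g: g[1] == GDEF_COMPONENT):
        names = [name for name, _ in run]
        if is_part:
            groups.append(names)
        else:
            groups.extend([name] for name in names)
    return groups


# Hypothetical glyph names and classes, for illustration only.
print(group_runs([("a", 1), ("eq1", 4), ("eq2", 4), ("b", 1)]))
# [['a'], ['eq1', 'eq2'], ['b']]
```

Under this rule, two ligatures that sit directly next to each other would merge into one run, since nothing marks where one ends and the next begins.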

@kovidgoyal

kovidgoyal commented May 18, 2021 via email

@be5invis
Owner

> It's not possible. Think of two ligatures that are logically separate,
> but neighboring. If all the glyphs have the same suffix, then there is
> no reasonable way to tell the end of one and the start of the next apart.

Do we need to handle this case separately? If it is rendering-only, then combining adjacent ligatures into one long run won't matter much, even for performance, since ligatures do not usually meet.

@kovidgoyal

kovidgoyal commented May 18, 2021 via email

@be5invis
Owner

be5invis commented May 18, 2021

I do not think neighboring ligatures occur often in real life. In code they are usually operators separated by spaces, and I do not think they are frequent in terminal output either.

To clarify: if we need to implement the start/middle/end glyph name suffixes, I would need to rewrite almost all the code related to ligation, and it may introduce many more doppelganger glyphs (glyphs having the same geometry but a different GID/name). It would also make shaping slower, since we would need more OTL lookups to "correct" the glyphs to the properly named ones.

For example, the image below shows <~~~~~~~~, and the long tail is actually built up with a single glyph that overlays with the glyph before it. Implementing such a tail will need only one lookup (sub [tail_connect_to_arrow tail_end] tilde' by tail_end), but if we want the glyphs to follow the naming pattern, we will need an extra lookup to "fix" the waves in between (like sub tail_end' tail_end by tail_mid).

image
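
The two lookups mentioned above can be simulated on a plain list of glyph names. This is only a sketch, not Iosevka's build code. It assumes that the backtrack context of a lookup is matched against glyphs that have already been substituted, that the lookahead is matched against glyphs not yet processed, and that an earlier (hypothetical) step has already produced the tail_connect_to_arrow glyph.

```python
def apply_tail_lookup(glyphs):
    """Simulate: sub [tail_connect_to_arrow tail_end] tilde' by tail_end"""
    out = []
    for glyph in glyphs:
        # Backtrack is checked against the already-substituted output, so the
        # single rule propagates along the whole run of tildes.
        if glyph == "tilde" and out and out[-1] in ("tail_connect_to_arrow", "tail_end"):
            out.append("tail_end")
        else:
            out.append(glyph)
    return out


def apply_rename_lookup(glyphs):
    """Simulate: sub tail_end' tail_end by tail_mid"""
    out = []
    for i, glyph in enumerate(glyphs):
        # Lookahead is checked against the not-yet-processed input.
        nxt = glyphs[i + 1] if i + 1 < len(glyphs) else None
        out.append("tail_mid" if glyph == "tail_end" and nxt == "tail_end" else glyph)
    return out


run = ["tail_connect_to_arrow", "tilde", "tilde", "tilde"]
one_pass = apply_tail_lookup(run)
print(one_pass)                       # ['tail_connect_to_arrow', 'tail_end', 'tail_end', 'tail_end']
print(apply_rename_lookup(one_pass))  # ['tail_connect_to_arrow', 'tail_mid', 'tail_mid', 'tail_end']
```

The first pass alone already gives the right shapes; the second pass exists only to make the names fit a start/middle/end scheme, which is the extra cost described above.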

@kovidgoyal

kovidgoyal commented May 18, 2021 via email

@be5invis
Owner

> See above, you still need only two glyphs, just the first one should have a name ending with _start.seq and the second a name ending with _end.seq

So your implementation supports the “start-end-end…-end” and “start-start-start-…-end” patterns?

Or the “start” means “this glyph must be rendered with the glyph right after it”, while “end” means “this glyph must be rendered with the glyph right before it”?

@kovidgoyal

kovidgoyal commented May 18, 2021

kitty will recognize all of the following:

  1. start, end
  2. start, middle, ... , middle, end
  3. middle, end
  4. middle, ... , middle, end
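
One way to read the four accepted forms is "an optional start, any number of middles, then an end, with at least two glyphs in total". A minimal sketch of that reading follows; it is an interpretation of the list above, not kitty's actual code, and the function name is hypothetical.

```python
import re

ROLE_CODES = {"start": "s", "middle": "m", "end": "e"}
# Either start + zero or more middles, or one or more middles, followed by end.
ACCEPTED = re.compile(r"(?:sm*|m+)e")


def is_recognized_sequence(roles):
    """Check a list of roles ('start', 'middle', 'end') against the four forms."""
    try:
        encoded = "".join(ROLE_CODES[role] for role in roles)
    except KeyError:
        return False  # unknown role name
    return ACCEPTED.fullmatch(encoded) is not None


print(is_recognized_sequence(["start", "end"]))            # True
print(is_recognized_sequence(["middle", "middle", "end"]))  # True
print(is_recognized_sequence(["start", "end", "end"]))      # False
```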

@be5invis
Owner

I'd like to have a behavior that:

  • If a glyph's name ends with _start.seq, then it must be rendered with the (non-mark) glyph right after it, no matter what it is, or how it is named;
  • If a glyph's name ends with _end.seq, then it must be rendered with the (non-mark) glyph right before it, no matter what it is, or how it is named;
  • If a glyph's name ends with _middle.seq, then it must be rendered with the (non-mark) glyph right after and right before it, no matter what they are, or how they are named.

This will allow patterns like start--start--start--end, start--end--end--end or even isolate--end--middle--end. This will simplify the work of font designers a lot.

@be5invis
Owner

be5invis commented May 18, 2021

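| Before \ After | Isolated | Start    | Middle | End  |
|----------------|----------|----------|--------|------|
| Isolated       | Separate | Separate | Join   | Join |
| Start          | Join     | Join     | Join   | Join |
| Middle         | Join     | Join     | Join   | Join |
| End            | Separate | Separate | Join   | Join |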

@kovidgoyal

kovidgoyal commented May 18, 2021

If you really need that much flexibility, then probably better to use a
different naming scheme, because those rules will likely break other fonts. In
particular the requirement that end/middle glyphs always force joins.

So I suggest picking a different suffix instead of .seq, maybe .calt-lig or
even .iosevka.

Then I will have kitty check the glyphs for the === ligature. If one of them
has the suffix .iosevka it will use the rules you want, otherwise it will
use the current FiraCode/CascadiaCode rules.

Also, if you are going to set up such rules, it would probably be good to use them
for all ligatures not just infinite length ones. That will make the code to
recognize ligatures much simpler and more robust and will probably make your
code to generate them also simpler and more robust.

@leiserfg
Author

leiserfg commented May 18, 2021

I think if that's the way, maybe a more generic suffix (not iosevka) would be better, so that other font designers could use it too (like .liga, .calt-lig, .kitty). And maybe it could be exposed somehow in the font (I'm not sure if fonts can have custom properties) so that the === heuristic is not required; that way it could also be used for non-programming fonts.

@kovidgoyal

Yes, better to make it into a general specification that can be widely used, but that requires more work :)

@be5invis
Owner

@leiserfg @kovidgoyal I am thinking that if we need to implement joining-based semantics, the glyph naming pattern could be .join-r, .join-l and .join-m, indicating whether a glyph joins with the adjacent glyph at its right, at its left, or at both left and right.

The table is:

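| Before \ After | Isolated | .join-r  | .join-m | .join-l |
|----------------|----------|----------|---------|---------|
| Isolated       | Separate | Separate | Join    | Join    |
| .join-r        | Join     | Join     | Join    | Join    |
| .join-m        | Join     | Join     | Join    | Join    |
| .join-l        | Separate | Separate | Join    | Join    |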

@kovidgoyal

Fine by me, though since the direction of text can be LTR or RTL,
it is probably better to use p (previous), n (next) and b (both).

@be5invis
Owner

@kovidgoyal Well, I think l and r are better for font designers, since glyph visuals are, well, visual.

@be5invis
Owner

One side note: the joining-ness of a glyph pair is a sort of disjunction, i.e., if the glyph at left has .join-r while the glyph at right is not considered joining, Kitty should still glue them together, because the first glyph indicates that it must join with the glyph after it. The table above shows all the cases of adjacent glyph pairs.
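
The table and the disjunction rule can be written as a pairwise predicate, plus a pass that splits a glyph run into groups to be rendered together. This is a minimal sketch of the proposal as stated here, with hypothetical glyph names; it ignores mark glyphs and is not kitty's implementation.

```python
def join_kind(glyph_name):
    """Return 'r', 'm' or 'l' from the glyph name suffix, or None if non-joining."""
    for kind in ("r", "m", "l"):
        if glyph_name.endswith(".join-" + kind):
            return kind
    return None


def joins(before, after):
    """True if two adjacent glyphs are glued together, per the table above."""
    return join_kind(before) in ("r", "m") or join_kind(after) in ("m", "l")


def split_into_groups(glyph_names):
    """Split a glyph run into groups that must be rendered together."""
    groups = []
    for name in glyph_names:
        if groups and joins(groups[-1][-1], name):
            groups[-1].append(name)
        else:
            groups.append([name])
    return groups


# Hypothetical glyph names, for illustration only.
print(split_into_groups(["a", "eq.join-r", "eq.join-l", "b"]))
# [['a'], ['eq.join-r', 'eq.join-l'], ['b']]
print(split_into_groups(["bar.join-r", "plain"]))
# [['bar.join-r', 'plain']]
```

The second call shows the disjunction: the plain glyph carries no suffix, but it is still pulled into the group because the glyph before it asks to join rightwards.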

@kovidgoyal

kovidgoyal commented May 19, 2021 via email

@be5invis be5invis added this to the Backlog milestone May 19, 2021
@kovidgoyal

kovidgoyal commented May 19, 2021 via email

@kovidgoyal

kovidgoyal commented May 19, 2021 via email

@be5invis
Owner

> What if a join-l is followed by another join-l or a join-m?

The ligature will continue. I think some of my edits were lost because you are reading this by mail. Here are the revised conditions for whether a glyph starts or ends a ligature:

Start condition: Either

  • A .join-r, NOT after a .join-r or .join-m;
  • A non-joining, preceding a .join-l or .join-m, and NOT after a .join-r or .join-m;

End condition: Either

  • A .join-l NOT preceding a .join-l or .join-m.
  • A non-joining, after a .join-r or .join-m, and NOT preceding a .join-l or .join-m.
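
The start and end conditions above translate directly into two predicates over a glyph run. This is a sketch of the conditions exactly as listed, so cases they do not mention (for example a run that begins with a .join-l glyph) simply return False. The glyph names in the usage lines are the ones in the hb-shape output quoted later in this thread.

```python
def kind(name):
    """Return 'r', 'm' or 'l' from the glyph name suffix, or None if non-joining."""
    for k in ("r", "m", "l"):
        if name.endswith(".join-" + k):
            return k
    return None


def _neighbours(glyphs, i):
    prev_kind = kind(glyphs[i - 1]) if i > 0 else None
    next_kind = kind(glyphs[i + 1]) if i + 1 < len(glyphs) else None
    return prev_kind, next_kind


def starts_ligature(glyphs, i):
    prev_kind, next_kind = _neighbours(glyphs, i)
    after_joiner = prev_kind in ("r", "m")
    current = kind(glyphs[i])
    if current == "r":
        return not after_joiner
    if current is None:  # non-joining glyph
        return next_kind in ("l", "m") and not after_joiner
    return False


def ends_ligature(glyphs, i):
    prev_kind, next_kind = _neighbours(glyphs, i)
    before_joiner = next_kind in ("l", "m")
    current = kind(glyphs[i])
    if current == "l":
        return not before_joiner
    if current is None:  # non-joining glyph
        return prev_kind in ("r", "m") and not before_joiner
    return False


run = [".gid9172.join-r", ".gid9173.join-l", ".gid9174.join-l"]
print([starts_ligature(run, i) for i in range(len(run))])  # [True, False, False]
print([ends_ligature(run, i) for i in range(len(run))])    # [False, False, True]
```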

@kovidgoyal

kovidgoyal commented May 19, 2021 via email

@be5invis
Owner

> It means two non-joining glyphs are a ligature.

A non-joining glyph can start a ligature if and only if it is right before a (.join-l or .join-m) glyph and is NOT after a (.join-r or .join-m) glyph. So two adjacent non-joining glyphs won't start a ligature.

@kovidgoyal

Does dlig need to be turned on explicitly? Is it off by default? And the slash glyph is too wide to fit in the bounding box of the rest of the font, so it is truncated. kitty either truncates or resizes all glyphs that don't fit in a cell, because all cells are rendered in parallel, independently, which is what gives it its performance.

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal dlig is off by default. To enable it you should explicitly turn this feature on, and turn off calt since they conflict with each other.

Iosevka's slash looks like this, which is pretty... normal. The xMax is 442, which fits in the unit cell.
image

@kovidgoyal

Fortunately kitty has the ability to turn on/off features in its config, so here is the output with calt off and dlig on
ligs

Those ligatures still don't work, but other ones involving [| do.

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal Well, your image matches my sample pretty well now. Except for slash -- there's definitely something wrong with it.

@leiserfg
Author

leiserfg commented Jun 4, 2021

Yes, it should be drawn like a mirror of \ but it's not.

@be5invis
Owner

be5invis commented Jun 4, 2021

There are some non-standard tags that enable certain ligatures, including:

JSPT: Enables special === and !==
image

HSKL: Enables /= as the inequality sign
image

FSTA: Enables <> as the inequality sign, plus === and =!=
image

MTLB: Enables ~= as the inequality sign
image

@kovidgoyal

kovidgoyal commented Jun 4, 2021

It's not a ligature with dlig either:

hb-shape --show-extents --cluster-level=1 --shapers=ot --features "calt=0,dlig=1" iosevka-regular.ttf '/='
[u002F=0+500<59,823,382,-966>|u003D=1+500<64,464,372,-248>]

The rest of the features should work fine; kitty doesn't special-case any font feature other than calt. calt-based ligatures are disabled when the cursor is over them. Other kinds of ligatures won't be.

As for why / is rendering differently, that's my mistake (my kitty config uses a special font for the slash character). With the default config it's fine.

ligs

@kovidgoyal

So the only remaining issue is the lack of ligature glyph names for /= and similar, even under dlig.

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal Well, even in dlig, /= doesn't ligate either. /= is only ligated under non-standard tags targeting Haskell-like languages (since these languages really do use /= as the inequality sign).

@kovidgoyal

OK, then as far as I am concerned, this issue can be closed. People that want those extra ligatures can turn on the features in kitty.conf

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal Could you please test the non-standard tags that I listed in my previous comment? They involve special glyphs and I'd like to validate whether they work correctly.

@kovidgoyal

I tried jspt; it doesn't seem to have any effect, as you can check for yourself using hb-shape:

hb-shape --show-extents --cluster-level=1 --shapers=ot --features "-calt,+dlig,+jspt" iosevka-regular.ttf '==='
[.gid9172.join-r=0+500<64,464,519,-248>|.gid9173.join-l=1+500<-83,464,519,-248>|.gid9174.join-l=2+500<-147,464,583,-248>]
hb-shape --show-extents --cluster-level=1 --shapers=ot --features "-calt,+dlig,-jspt" iosevka-regular.ttf '==='
[.gid9172.join-r=0+500<64,464,519,-248>|.gid9173.join-l=1+500<-83,464,519,-248>|.gid9174.join-l=2+500<-147,464,583,-248>]

There is no difference; in both cases the dlig ligature is used. If I use "-calt,-dlig,+jspt", no ligation happens at all.
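
Comparing two hb-shape runs by eye is error-prone, so a small parser for the output can help. This is a sketch that handles only the bracketed form shown above (name=cluster+advance, with optional offsets and extents); the function name is hypothetical and other hb-shape output options are not covered.

```python
import re

GLYPH_RE = re.compile(
    r"(?P<name>[^=|\[\]]+)=(?P<cluster>\d+)"  # glyph name and cluster index
    r"(?:@-?\d+,-?\d+)?"                      # optional x,y offsets
    r"\+(?P<advance>-?\d+)"                   # horizontal advance
    r"(?:<-?\d+,-?\d+,-?\d+,-?\d+>)?"         # optional extents
)


def parse_hb_shape(line):
    """Parse one line of hb-shape output into a list of glyph dicts."""
    glyphs = []
    for item in line.strip().strip("[]").split("|"):
        match = GLYPH_RE.fullmatch(item)
        if match is None:
            raise ValueError(f"unrecognized glyph entry: {item!r}")
        glyphs.append({
            "name": match.group("name"),
            "cluster": int(match.group("cluster")),
            "advance": int(match.group("advance")),
        })
    return glyphs


# Output line copied from the hb-shape run above.
SAMPLE = (
    "[.gid9172.join-r=0+500<64,464,519,-248>|.gid9173.join-l=1+500<-83,464,519,-248>"
    "|.gid9174.join-l=2+500<-147,464,583,-248>]"
)
print([g["name"] for g in parse_hb_shape(SAMPLE)])
# ['.gid9172.join-r', '.gid9173.join-l', '.gid9174.join-l']
```

Parsing both runs and comparing the name lists would show directly whether a feature changed which glyphs were selected.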

@kovidgoyal

Ah, never mind, it needs to be uppercased.

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal Yeah non-standard tags should be all uppercase letters.
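
Since OpenType feature tags are case-sensitive, a lowercase jspt names a different feature than JSPT. A small helper can guard against that slip when building the feature string. This is a sketch: the helper names are hypothetical, and the tag set contains only the four non-standard tags listed earlier in this thread.

```python
# The four non-standard Iosevka tags mentioned in this thread.
NONSTANDARD_TAGS = {"JSPT", "HSKL", "FSTA", "MTLB"}


def normalize_tag(tag):
    """Uppercase a tag if it is one of the known non-standard tags."""
    upper = tag.upper()
    return upper if upper in NONSTANDARD_TAGS else tag


def features_arg(enable=(), disable=()):
    """Build a feature string in the '-calt,+dlig,+JSPT' form used above."""
    parts = ["-" + normalize_tag(tag) for tag in disable]
    parts += ["+" + normalize_tag(tag) for tag in enable]
    return ",".join(parts)


print(features_arg(enable=["dlig", "jspt"], disable=["calt"]))  # -calt,+dlig,+JSPT
```

The result can be passed as the value of hb-shape's --features option, as in the commands above.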

@kovidgoyal

Here is -calt +JSPT
ligs

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal JSPT looked fine. How about HSKL, FSTA and MTLB?

@kovidgoyal

kitty --config=NONE -o font_family=Iosevka -o 'font_features Iosevka -calt +HSKL' sh -c "cat /t/iosevka-corpus.txt; echo; cat"
ligs

@kovidgoyal

kitty --config=NONE -o font_family=Iosevka -o 'font_features Iosevka -calt +FSKL' sh -c "cat /t/iosevka-corpus.txt; echo; cat"
ligs

@kovidgoyal

kitty --config=NONE -o font_family=Iosevka -o 'font_features Iosevka -calt +MTLB' sh -c "cat /t/iosevka-corpus.txt; echo; cat"
ligs

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal Typo: FSKL → FSTA

@kovidgoyal

kovidgoyal commented Jun 4, 2021

kitty --config=NONE -o font_family=Iosevka -o 'font_features Iosevka -calt +FSTA' sh -c "cat /t/iosevka-corpus.txt; echo; cat"
ligs

@be5invis
Owner

be5invis commented Jun 4, 2021

@kovidgoyal LGTM.
