Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Mispositioned U+1C2C LEPCHA VOWEL SIGN E below U+1C37 LEPCHA SIGN NUKTA #1

Closed
dscorbett opened this issue Apr 17, 2022 · 9 comments
Closed
Assignees

Comments

@dscorbett
Copy link

Font

NotoSansLepcha-Regular.otf

Where the fonts came from, and when

Site: https://github.com/googlefonts/noto-fonts/blob/4d7396a4fae475975b367e225f2e8f49292d8837/unhinted/otf/NotoSansLepcha/NotoSansLepcha-Regular.otf
Date: 2022-04-16

Font version

Version 2.002

Issue

U+1C2C LEPCHA VOWEL SIGN E should be positioned below U+1C37 LEPCHA SIGN NUKTA, but that only works in this font below single base letters. Below anything else, like U+25CC DOTTED CIRCLE or the cluster <U+1C00 LEPCHA LETTER KA, U+1C24 LEPCHA SUBJOINED LETTER YA>, the vowel sign overlaps the nukta.

Character data

ᰀ᰷ᰤᰬ◌᰷ᰬ
U+1C00 LEPCHA LETTER KA
U+1C37 LEPCHA SIGN NUKTA
U+1C24 LEPCHA SUBJOINED LETTER YA
U+1C2C LEPCHA VOWEL SIGN E
U+25CC DOTTED CIRCLE
U+1C37 LEPCHA SIGN NUKTA
U+1C2C LEPCHA VOWEL SIGN E

Screenshot

ᰀ᰷ᰤᰬ◌᰷ᰬ

@simoncozens
Copy link
Contributor

simoncozens commented Apr 23, 2022

OK, I've made some progress on this, but this is another of those cross-ligature weirdness issues. Mark-to-mark attachment is working fine for this sequence:

1c00,1c24,1c37,1c2c (ᰀᰤ᰷ᰬ):

shape

But your sequence:

1c00,1c37,1c24,1c2c (ᰀ᰷ᰤᰬ)

(where the two mark glyphs to be mark-to-mark attached are either side of a glyph which takes part in ligation - or perhaps I'm just hyper-aware of this since we've seen this pattern recently) does not attach:

shape

Note that USE does not perform any reordering here and both sequences are accepted without causing dotted circles.

I think we're going to have to go back to the proposal and check which encoding is correct.

@dscorbett
Copy link
Author

The encoding with the nukta before the medial consonant is correct. The Unicode Standard, section 13.12 “Lepcha” specifies the syllable order as <consonant or letter a, nukta, medial -ra, medial -ya, dependent vowel, final consonant sign, syllabic modifier>. It also gives the example of <U+1C00 LEPCHA LETTER KA, U+1C37 LEPCHA SIGN NUKTA, U+1C25 LEPCHA SUBJOINED LETTER RA>.

@simoncozens
Copy link
Contributor

I can't find a way to get mark-to-mark working on this kind of cross-ligature mark situation:

ᰀ᰷ᰬᰀ᰷ᰤᰬ

[uni1C00=0+602|uni1C37=0@-272,0+0|uni1C2C=0@-272,-182+0|uni1C00_1C24=3+840|uni1C37=3@-510,0+0|uni1C2C=3@-510,0+0]

shape

So I think I'm just going throw my hands up and make a ligature of the nukta/vowel e.

@dscorbett
Copy link
Author

The fix is incomplete. For example, here is <U+1C00, U+1C37, U+1C25, U+1C2C> in version 2.003.
ᰀ᰷ᰥᰬ

@simoncozens simoncozens reopened this May 2, 2022
@simoncozens simoncozens transferred this issue from notofonts/noto-fonts Jun 20, 2022
simoncozens added a commit that referenced this issue Sep 6, 2022
@dscorbett
Copy link
Author

The fix is incomplete in version 2.004. Some sequences have have fixed, like <U+1C13, U+1C37, U+1C25, U+1C2C>, because uni1C13_1C25 is now a ligature glyph, but <U+1C00, U+1C37, U+1C25, U+1C2C> is still broken because uni1C00_1C25 is a base glyph. The relevant 'ccmp' lookup only skips ligature glyphs.

@simoncozens
Copy link
Contributor

simoncozens commented Sep 6, 2022

Huh. I wasn't using a ccmp lookup this time but mark-to-mark positioning!

@simoncozens simoncozens reopened this Sep 6, 2022
@simoncozens
Copy link
Contributor

Looks like my previous fix worked purely by chance. The anchors are there but mkmk is not happening; we're getting ligated marks instead.

simoncozens added a commit that referenced this issue Oct 6, 2022
@dscorbett
Copy link
Author

The fix is incomplete in version 2.005. Here is ⟨ᰂ᰷ᰥᰤᰬ⟩ <U+1C02, U+1C37, U+1C25, U+1C24, U+1C2C>:
ᰂ᰷ᰥᰤᰬ

@simoncozens simoncozens reopened this Oct 6, 2022
@simoncozens
Copy link
Contributor

I think I've got this now. The problem is that the cluster order goes below-base vowels, post-base things, and then below-base marks. One approach to this (which the font followed previously) is making the post-base glyphs into marks and filtering them out, but this makes mark attachment (particularly on OS X for some reason) not quite work. So we have to skip over post-base glyphs manually in the code, and there may be several. I don't know how many there can be in Lepcha. I've allowed for three, which allows for ra+ya+a vowel. Lepcha has a fairly simple vowel inventory; I don't think there can be more than one vowel together, so this should be enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants