Balinese shaping issues #22

Closed
jungshik opened this issue Nov 18, 2015 · 36 comments

Comments

@jungshik

Spun off from notofonts/noto-fonts#543

@kmansourMT gave us some test strings he used. I shaped them with harfbuzz using two versions of Noto Sans Balinese: the current version (with both DFLT and non-DFLT OT tables) and my local version with ONLY DFLT.

The latter looks better, but both have shaping issues. I'll add more details shortly.

We need to sort this out.

/cc @roozbehp @behdad

@jungshik
Author

I made a test file based on the VOLT screenshots sent by @kmansourMT.

test.uni.txt : test input (\u-escaped)

test.txt : test input in UTF-8
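For anyone reproducing the two test files, here is a minimal sketch of converting \u-escaped input to UTF-8. It assumes the escaped file uses plain four-hex-digit `\uXXXX` escapes; the exact layout of test.uni.txt is not shown in this thread, and the function names are hypothetical.

```python
import re

# Matches one \uXXXX escape with exactly four hex digits (assumed format).
_U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def decode_u_escapes(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names."""
    return _U_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def to_utf8_bytes(escaped: str) -> bytes:
    """Decode the escapes, then encode the result as UTF-8."""
    return decode_u_escapes(escaped).encode("utf-8")


if __name__ == "__main__":
    # First syllable of line 1: U+1B13 U+1B00 U+1B38
    print(to_utf8_bytes(r"\u1b13\u1b00\u1b38"))
```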

Left is the rendered result with the current (1.03: both DFLT and non-DFLT) font and right is the result (DFLT-only). When non-DFLT (Balinese) is present, harfbuzz routes it to USE.

image

@jungshik
Author

As you can see in the screenshot, both versions of the font have issues but often they have different issues.

@kmansourMT : The first syllable in the first line has U+1B13 U+1B00 U+1B38. Is it correct to have U+1B00 between U+1B13 (consonant) and U+1B38 (dependent vowel)? I couldn't find any information as to where in a syllable U+1B00 should go. If it's like signs in other Brahmi-derived scripts, it should go after a vowel, shouldn't it?

I'm gonna try it on Windows 10 + Edge. BTW, what layout engines were used to test Balinese?

@jungshik
Author

  • Line 1 : balinese-volt-proofing.PNG

    • Old:
      • 1st : U+1B13 U+1B00 U+1B38
        : USE considers it invalid to have U+1B38 (a vowel sign) after U+1B00
    • New:
      • 4th: U+1B40 (2-part vowel sign) : the left part is not reordered wrt
        a consonant (U+1B19)
      • 5th: U+1B1A, U+1B3F : U+1B3F (left vowel sign) is not reordered wrt
        a consonant (U+1B1A)

    Note that reordering is done with 'dist' in GPOS.

  • Line 2 : balinese-volt-proofing.PNG - No issue

  • Line 3 : Balinese_OT_final.png

    • Both

      • 2nd and 3rd : The 2nd part of U+1B3C and U+1B3D (1B3C.blw) are not put below
        the subjoined form of U+1B13. Instead, it overlaps the subjoined form of
        U+1B13. It's supposed to work with mark positioning (with anchor)

      Note that a one-part vowel sign (U+1B38 in the 1st ) is placed correctly
      below the subjoined form of U+1B13 in the 1st syllable.

  • Line 4: Bal_VOLT_reorder-final1.PNG

    • Old

      • 2nd and 3rd : They have two dependent vowel signs in a row -
        U+1B3[68], U+1B3E. They appear to be considered illegal by USE.

      @kmansourMT : I haven't seen this kind of sequence before in other
      scripts. What led you to test these sequences?

    • New

      • All 5 syllables: U+1B3E or the left part of U+1B40 is not reordered wrt
        a preceding consonant/conjunct and is placed incorrectly.
  • Line 5: Bal_VOLT_reorder-final2.PNG

    • Old
      • 1st: U+1B3E is misplaced (not reordered) after a conjunct.
      • 2nd: Ditto with U+1B3F
      • 3rd: Ditto with U+1B3E
    • New
      • U+1B3E, U+1B3F are misplaced ('partly reordered'). Instead of being
        placed at the left of 'a conjunct' (rendered with two glyphs),
        they're placed between the two glyphs of the conjunct.
  • Line 6: Bal_VOLT_reorder-final3.PNG

    • Old : the same issue as Line 5. U+1B3E is misplaced (not reordered)
      w.r.t. a conjunct.
    • New : the same issue as Line 5. U+1B3E is misplaced ('partly reordered').
      It's placed between the two glyphs of a conjunct.
  • Line 7: bal_1B1B-alt.PNG

    • Both
      • 2nd: U+1B3E is not reordered (misplaced) w.r.t. a conjunct
      • 3rd: This seems to be correct, but the Volt screenshot given by
        @kmansourMT does not have a glyph for U+1B00 (Ulu Ricem).
  • Line 8 and line 9 : Matches the volt screenshots
    bal_marks-on-conjuncts.PNG and mark-musicsymbols.PNG

@jungshik
Author

@kmansourMT : I have two questions in the above comment:

  1. Is "U+1B13 U+1B00 U+1B38" in Line 1 a valid sequence? Can U+1B00 come between a consonant and a dependent vowel? USE seems to consider it illegal. ( balinese-volt-proofing.PNG )
  2. Line 4: Can two dependent vowels come in a row? ( Bal_VOLT_reorder-final1.PNG )

@jungshik
Author

Below is the screenshot of IE 11 on Windows 10. The current version works better, but is not perfect.
The two questions I asked in the previous comment are still applicable.

image

@jungshik
Author

This is Edge on Windows 10 rendering http://jungshik.github.io/noto/balinese/test.html.
Interestingly, Edge's rendering/shaping is different from that of IE 11 on Windows 10. Edge is closer to harfbuzz (and browsers using harfbuzz - Chrome and Firefox) but is still different from harfbuzz.

screen shot 2015-11-19 at 10 13 38 am

@jungshik
Author

/cc @tiroj

@jungshik
Author

I updated http://jungshik.github.io/noto/balinese/test.html to have links to Volt proofing images from @kmansourMT

@roozbehp

I just looked at Line 1 and Line 4. The order of the characters in both cases should be changed. See the USE spec at https://www.microsoft.com/typography/OpenTypeDev/USE/intro.htm:

For Line 1, USE expects Bindus (VM) to appear after vowels. So the test string should be <U+1B13, U+1B38, U+1B00> instead.

For Line 4, USE expects left-side vowels (VPre) to appear before above vowels (VAbv) and right-side vowels (VPst). So the test string should be <U+1B13, U+1B3E, U+1B36>, etc.
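The two ordering rules above can be sketched as a small checker. This is only an illustration of those two constraints (left-side vowels first, bindus after all vowels) for a single syllable; the category sets are a hand-written subset covering code points mentioned in this thread, not the USE specification's own tables, and the function names are hypothetical.

```python
# Illustrative subset only (an assumption, not the USE category tables).
VPRE = {0x1B3E, 0x1B3F}                  # left-side vowel signs
OTHER_VOWELS = {0x1B35, 0x1B36, 0x1B38}  # right-side, above, below vowel signs
VM = {0x1B00}                            # bindu-like sign (Ulu Ricem)


def rank(cp):
    """Return the expected relative position of a mark, or None for others."""
    if cp in VPRE:
        return 0
    if cp in OTHER_VOWELS:
        return 1
    if cp in VM:
        return 2
    return None  # consonants and anything not covered by this sketch


def marks_in_expected_order(codepoints):
    """True if VPre precede other vowels and VM follow all vowels.

    Assumes the input is a single syllable.
    """
    ranks = [r for r in map(rank, codepoints) if r is not None]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


if __name__ == "__main__":
    for seq in ([0x1B13, 0x1B00, 0x1B38], [0x1B13, 0x1B38, 0x1B00],
                [0x1B13, 0x1B36, 0x1B3E], [0x1B13, 0x1B3E, 0x1B36]):
        label = " ".join(f"U+{cp:04X}" for cp in seq)
        print(label, marks_in_expected_order(seq))
```

The original Line 1 and Line 4 strings fail this check, and the reordered strings suggested above pass it.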

@jungshik
Author

Thanks, @roozbehp. I updated my test page. Line 1a and line 4 have updated sequences per @roozbehp's comment and they're shaped well.

Back to @kmansourMT : @roozbehp told me that left-vowel sign should come before other vowels (when two vowels come in a row) unless they form a 'diphthong'.

Have you seen <U+1B36, U+1B3E> and other pairs (where left-vowel sign such as U+1B3E or U+1B3F is put after another non-left vowel sign) in your test (lines 4) form a diphthong? Is that why you're testing those sequences (the 2nd and 3rd syllables in http://jungshik.github.io/noto/balinese/Bal_VOLT_reorder-final1.PNG )?

@roozbehp : how does USE handle 'diphthong' cases (where visually 'left vowel sign' should come after another vowel sign )?

@jungshik
Author

/cc @NorbertLindenberg

BTW, a test adapted from @NorbertLindenberg's (#163) shows that reordering with U+1B3[EF], U+1B4[01], etc. does not happen with harfbuzz when non-DFLT OpenType tables are dropped. That is, reordering with dist in DFLT does not work in harfbuzz.
OTOH, Edge on Win 10 seems to be fine with or without non-DFLT table.

See the screenshots below taken of http://jungshik.github.io/noto/balinese/left_vowel.html
(top in each cell uses Noto Sans Balinese with both DFLT and non-DFLT and bottom in each cell uses
the font without non-DFLT).

  • Chrome/Firefox/harfbuzz
    screen shot 2015-11-19 at 4 47 22 pm
  • Edge on Windows 10
    screen shot 2015-11-19 at 4 48 30 pm

@kmansourMT

Regarding:
<<Have you seen <U+1B36, U+1B3E> and other pairs (where left-vowel sign such as U+1B3E or U+1B3F is put after another non-left vowel sign) in your test (lines 4) form a diphthong? Is that why you're testing those sequences (the 2nd and 3rd syllables in http://jungshik.github.io/noto/balinese/Bal_VOLT_reorder-final1.PNG )? >>

At the time of testing, our designer did not know details about vowel ordering to such an extent, but was primarily interested in verifying the reordering.

@roozbehp

@roozbehp : how does USE handle 'diphthong' cases (where visually 'left vowel sign' should come after another vowel sign)?

I'm not sure I understand the question, but I'll try to answer anyway:

a) If the question is what happens if there's a left-side "I" vowel and a right-side "E" vowel and the language uses both to represent the "EI" linguistic diphthong and it's written visually as <I><CONSONANT><E>, USE basically says ignore the actual linguistic order and encode it as <CONSONANT><I><E>.

b) If the question is how one would represent the visual sequence <CONSONANT><I><E> where "I" is a left-side vowel and "E" is a right-side vowel, I've yet to see such a sequence on paper. But if it indeed exists, one needs to bring it to the attention of the Unicode and OpenType Layout communities.

@jungshik
Author

@roozbehp Thank you. My question was a) and you answered the question although I'd not like having to type <consonant><I><E> for <consonant><EI> if I were a speaker of such a hypothetical language ;-). Well, it's not likely that there's such a language-script pair.

@jungshik
Author

At the time of testing, our designer did not know details about vowel ordering to such an extent,
but was primarily interested in verifying the reordering.

Ok. Are there any other strings in http://jungshik.github.io/noto/balinese/test.html that you know are not valid? We know that line 1 (the 1st syllable) is invalid.

@jungshik
Author

To sort out font issues and potential harfbuzz issues, I'm putting up the shaping results from Edge on Win 10 (with the latest test page http://jungshik.github.io/noto/balinese/test.html ) and Chrome (harfbuzz):

  • Edge on Windows 10:
    screen shot 2015-11-20 at 3 21 30 pm
  • Chrome (harfbuzz : a few-day-old harfbuzz trunk matches what I got from Chrome):
    screen shot 2015-11-20 at 3 24 33 pm

@kmansourMT

Had I known of these errors, I would have told you.


@jungshik
Author

Given the result shared in https://github.com/googlei18n/noto-fonts/issues/572#issuecomment-158246038 ( <consonant> <left-vowel> reordering does not work at all with non-DFLT tables dropped in harfbuzz), I'll focus on the results obtained with both DFLT and non-DFLT tables present. (harfbuzz will use USE in that case for Balinese).

Line 1a: USE on Windows matches harfbuzz. Everything seems all right (except for the position of an 'above-base mark' related to #21 )

Line 2: ditto

Line 3: 2nd and 3rd syllables have two 'below-base marks' overlapping each other. (related: #20) in both Windows USE and harfbuzz

Line 4a

  • USE : everything looks ok
  • hb: 4th-syllable : left-vowel position is wrong (no reordering)

Line 5

  • USE: ok
  • hb: the positions of three left vowels are wrong (no reordering)

Line 6

  • USE: U+1B3E is partially reordered. It's positioned between two glyphs of a consonant cluster.
  • hb: U+1B3E is to the right of the two glyphs for a consonant cluster: no reordering

Line 7: hb fails to reorder U+1B3E (left-vowel) wrt a conjunct in the 2nd syllable

Line 8 and 9 : Both USE and hb are fine.


@behdad, can you take a look at the following three? Thanks. (the test page 👍 http://jungshik.github.io/noto/balinese/test.html ).

  • For lines 4a, 5 and 7, hb disagrees with USE on Windows and USE is correct afaict. So, it's likely to be a harfbuzz issue. I'll file a bug against harfbuzz.
  • Line 3: Both USE and harfbuzz have the same bad shaping result. It's not yet known whether it's a font issue or a shaping-engine issue.
  • Line 6: USE and harfbuzz do not agree, but both fail to reorder. Again, it can be either a font issue or an engine issue.

@behdad

behdad commented Nov 21, 2015

@behdad, can you take a look at the following three? Thanks. (the test page 👍 http://jungshik.github.io/noto/balinese/test.html ).

For lines 4a, 5 and 7, hb disagrees with USE on Windows and USE is correct afaict. So, it's likely to be a harfbuzz issue. I'll file a bug against harfbuzz.

Lines 3: Both USE and harfbuzz have the same bad shaping result. It's not yet known whether it's a font or shaping engine.

Line 6: USE and harfbuzz do not agree, but both fail to reorder. Again, it can be either a font issue or an engine issue.

@jungshik please file a harfbuzz bug with just enough details to reproduce it. Jonathan and I will take a look at them in London in December. Thanks

@jungshik
Author

The first issue in my previous comment (lines 4a, 5 and 7) was filed against harfbuzz as shown above.

The 2nd and the 3rd issues are tricky because it's not clear who to blame, font or engine.

@kmansourMT, have you ever gotten the expected shaping result with any engine for line 3 and line 6?

@NorbertLindenberg

Jungshik, I assume wherever you said "Javanese" in this issue, you really meant Balinese?

@NorbertLindenberg

Is there any documentation from Microsoft on how the script-specific and DFLT feature lists interact with script engines in OpenType? I haven't been able to find such documentation. In particular, once the renderer has identified a run of, say, Balinese characters:

– If the renderer doesn't have a script engine supporting Balinese, does it still use feature lists for script "bali", or are those ignored and "DFLT" used instead?

– If the renderer does have a script engine supporting Balinese (the USE), but the font doesn't have feature lists for the script "bali", does the renderer still use the script engine supporting Balinese (the USE), or does it use a fallback engine that doesn't know anything about reordering Balinese vowels?

Answers to these questions might help me understand some of the behavior seen here.

@NorbertLindenberg

Sequences of two or more dependent vowels are allowed by the USE and worth testing. Some sequences occur in real life: 1B3A 1B35, 1B3C 1B35, 1B3E 1B35, 1B3F 1B35, 1B42 1B35. As Roozbeh said, the USE requires them to appear in a specific order.

@jungshik
Author

Sequences of two or more dependent vowels are allowed by the USE and worth testing.

Line 4a has a couple of cases for that. Line 4 has the order reversed and didn't work with USE/harfbuzz. Line 4a (order corrected) works with both. I can try more.

@jungshik
Author

Is there any documentation from Microsoft on how the script-specific and DFLT feature lists interact with script engines in OpenType?

In case of Harfbuzz, if 'bali' is present in a font, hb's implementation of USE is used. If not, the default shaping engine is used (with DFLT tables).
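The selection rule described above can be written down as a tiny sketch. This only restates the behavior reported in this comment; it is not harfbuzz code, and the function name and the "use"/"default" labels are hypothetical.

```python
from typing import Iterable


def choose_shaper(font_script_tags: Iterable[str], run_script_tag: str = "bali") -> str:
    """Pick a shaper label following the rule stated in the comment above.

    If the font's layout tables list the run's script tag, the USE
    implementation is chosen; otherwise the default shaper (with the DFLT
    tables) is chosen.
    """
    if run_script_tag in set(font_script_tags):
        return "use"
    return "default"


if __name__ == "__main__":
    print(choose_shaper({"DFLT", "bali"}))  # current font: both script records
    print(choose_shaper({"DFLT"}))          # local DFLT-only build
```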

@jungshik
Author

Multiple vowels in a row

Test file (line 1 has two vowels in a row. line 2 uses NFC form)
multi_dep_vowels.txt

  1. hb's USE (Noto Sans Balinese with both DFLT and non-DFLT/bali present):
    multi_dep_vowels old
  2. hb's dflt (Noto Sans Balinese with DFLT only):
    multi_dep_vowels new

With both DFLT and bali, hb's USE is used and it works as expected. With only DFLT, reordering is broken.
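For reference, the decomposed and NFC forms of such a test line can be produced with Python's standard `unicodedata` module. This is a sketch: the actual contents of multi_dep_vowels.txt are not shown here, so the sample syllable (U+1B13 followed by 1B3E 1B35, one of the real-life pairs listed earlier) is an assumption.

```python
import unicodedata


def nfc(text: str) -> str:
    """Return the NFC (composed) form of the text."""
    return unicodedata.normalize("NFC", text)


def nfd(text: str) -> str:
    """Return the NFD (decomposed) form of the text."""
    return unicodedata.normalize("NFD", text)


def codepoints(text: str) -> str:
    """Format a string as space-separated U+XXXX labels."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    decomposed = "\u1B13\u1B3E\u1B35"  # consonant + two vowel signs in a row
    print(codepoints(decomposed), "->", codepoints(nfc(decomposed)))
```

Since the two forms are canonically equivalent, a shaper and font would ideally render both lines of the test file identically.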

@kmansourMT

Based on all the exchanges above, the sequences that appear to be functional errors in the font are summarized by
screen shot 2015-12-04 at 16 50 26

Basically, the sequences that need further verification/correction consist of the pattern:
base consonant + subjoined consonant + {vowels u1B38–1B3D}

@jungshik
Author

@kmansourMT : There are more issues than what you referred to in the previous comment.

What you wrote about above is line 3 in my comment ( https://github.com/googlei18n/noto-fonts/issues/572#issuecomment-158562355 ).

In addition, line 1a and 2 also have issues ( bug notofonts/noto-fonts#380 ). So does line 6.

@KrasnayaPloshchad

HarfBuzz has improved its Balinese support; the improvement landed in 1.1.3.

@marekjez86 marekjez86 self-assigned this Jan 15, 2016
@kmansourMT

In the forthcoming version of Noto Balinese, we have corrected the problems previously encountered with the pattern "base consonant + subjoined consonant + {vowels u1B3C,1B3D}". The following demonstrates the changes that have been applied.

183c-subscripts-corrected

@jungshik
Author

jungshik commented Jun 6, 2016

@kmansourMT : When do you plan to deliver the aforementioned update? The latest I have in phase 2 (TTF) is from Sep 2015.
( https://github.com/googlei18n/noto-source/tree/master/src for Phase 3 does not have Balinese source, either)

Anyway, I guess you fixed these and related sequences.

U+1b13 U+1b44 U+1b13 U+1b3c 
U+1b13 U+1b44 U+1b13 U+1b3d

Does your update handle the following sequence as well?

U+1b13 U+1b44 U+1b13 U+1b38 
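To double-check what these sequences contain, here is a small sketch that turns the `U+XXXX` notation used in this thread into a string and looks up each character's Unicode name with the standard `unicodedata` module. The helper names are hypothetical.

```python
import unicodedata


def parse_sequence(notation: str) -> str:
    """Convert space-separated 'U+XXXX' tokens into a Python string."""
    chars = []
    for token in notation.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"expected a U+XXXX token, got {token!r}")
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


def describe(text: str) -> list:
    """Return the Unicode character name of each code point in the text."""
    return [unicodedata.name(ch, "<unnamed>") for ch in text]


if __name__ == "__main__":
    for name in describe(parse_sequence("U+1b13 U+1b44 U+1b13 U+1b38")):
        print(name)
```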

@jungshik
Author

jungshik commented Jun 6, 2016

Oh... U+1b13 U+1b44 U+1b13 U+1b38 appears to have been ok even in Sep 2015 version.

@kmansourMT

Susan W. is in charge of scheduling releases and updates.


@jungshik jungshik assigned waksmonskiMT and unassigned marekjez86 Jun 6, 2016
@jungshik
Author

jungshik commented Jun 6, 2016

Summary of this bug so far:

@waksmonskiMT, when do you plan to deliver the update @kmansourMT mentioned, which fixes both the "line 3" issue and the lines 1a/2 issue (also bug notofonts/noto-fonts#380)? With the font update, I can close this bug after verifying the fix.

@jungshik
Author

jungshik commented Jun 8, 2016

off-line conversation with @waksmonskiMT: we'll get @kmansourMT's fix in upcoming phase 3 delivery.

@behdad

behdad commented Jul 14, 2017

off-line conversation with @waksmonskiMT: we'll get @kmansourMT's fix in upcoming phase 3 delivery.

Any updates?!? It has been over a year.

@simoncozens simoncozens transferred this issue from notofonts/noto-fonts Jul 18, 2022