IBM Plex Sans Arabic issue with Tashkeel #7611

Open
AMKamel opened this issue Apr 28, 2024 · 7 comments · May be fixed by #8475
@AMKamel

AMKamel commented Apr 28, 2024

Specify the font name in title with a short description of the bug.
Please report any issue related to Noto fonts here.
Report any issue with Google Icon/Symbols here

Describe the bug
When you type the word اللَّهُ, it gets rendered incorrectly, like this:
image

To Reproduce
Type the word اللَّهُ
Expected behavior
It should render the way it appears in the text above:
image

Screenshots
Added
Additional context
N/A

@emmamarichal
Collaborator

@yanone could you take a look?

@yanone
Collaborator

yanone commented May 8, 2024

This is a problem that exists with many Arabic fonts, even my own (work-in-progress).

These fonts have a ligature for الله whose glyph already includes tashkeel that were never typed, so a user just types ا ل ل ه and it turns into the ligature including tashkeel. The الله that I typed here should probably also contain tashkeel.

That's of course problematic. In the issue above, the typed text itself contains tashkeel, which are then applied on top of the ligature that already contains tashkeel.

The easiest solution would be to remove the composed ligature from the font, requiring users to explicitly apply tashkeel. But this has not been the common practice in font-making. The common practice until now has been to type plain ا ل ل ه and receive a ligated الله incl. tashkeel.

Another solution is to correctly apply OpenType feature code.

In my own font (just now for testing) I've been able to eliminate the tashkeel collisions by removing the IgnoreMarks flag from the lookup that produces the allah-ar glyph, so the code looks like this:

lookup rlig_arab_1 {
  # Arabic
  script arab;
    # Default
    language dflt;
    lookupflag RightToLeft;
    sub alef-ar lam-ar.init lam-ar.medi heh-ar.fina by allah-ar;
} rlig_arab_1;

This means that, as soon as marks are involved, the ligature will not be substituted. The result speaks for itself:

Screenshot 2024-05-08 at 14 08 08

Before we move on to fix the issue in any of the fonts, I would like to proceed with defining a Fontbakery check for this.

I would make this a shaping check that counts the number of glyphs after shaping.
For example: if the sequence ا ل ل ه turns into a single glyph, then also measure what ا ل ل ه ُ turns into. If it is two glyphs (ligature + tashkeel), the OpenType code is ignoring marks, and the check should FAIL.
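
A minimal sketch of that glyph-counting check in Python, offered as an illustration rather than as Fontbakery or shaperglot code. The names `check_allah_ligature` and `harfbuzz_shaper` are hypothetical, the SKIP result for fonts without a ligature is an assumption, and the real shaper assumes the `uharfbuzz` bindings are installed. The check itself only needs a function that maps text to the list of glyph IDs after shaping.

```python
from typing import Callable, List

ALLAH = "\u0627\u0644\u0644\u0647"  # ا ل ل ه, typed without tashkeel
DAMMA = "\u064F"                    # one example tashkeel mark

# A shaper maps input text to the glyph IDs left after shaping.
Shaper = Callable[[str], List[int]]


def check_allah_ligature(shape: Shaper) -> str:
    """Return "FAIL" if the ligature is still formed when a mark is typed."""
    plain = shape(ALLAH)
    if len(plain) != 1:
        # Assumption: a font without a single-glyph ligature is not affected.
        return "SKIP"
    marked = shape(ALLAH + DAMMA)
    # Ligature plus one mark glyph means the lookup ignored the mark.
    if len(marked) == 2 and plain[0] in marked:
        return "FAIL"
    return "PASS"


def harfbuzz_shaper(font_path: str) -> Shaper:
    """Build a shaper backed by HarfBuzz (assumes uharfbuzz is installed)."""
    import uharfbuzz as hb

    blob = hb.Blob.from_file_path(font_path)
    font = hb.Font(hb.Face(blob))

    def shape(text: str) -> List[int]:
        buf = hb.Buffer()
        buf.add_str(text)
        buf.guess_segment_properties()
        hb.shape(font, buf)
        # After shaping, `codepoint` holds the glyph ID.
        return [info.codepoint for info in buf.glyph_infos]

    return shape
```

With a font file at hand, `check_allah_ligature(harfbuzz_shaper("IBMPlexSansArabic-Regular.ttf"))` would run the check against it. A fuller version would presumably loop over every tashkeel mark instead of testing only the damma.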

@khaledhosny @simoncozens, what are your thoughts on this?

@khaledhosny
Contributor

See IBM/plex#407

@simoncozens
Collaborator

My thoughts are (a) it's a good candidate for a shaperglot check, and (b) https://www.unicode.org/notes/tn46/tn46-1.pdf

@yanone
Collaborator

yanone commented May 10, 2024

@simoncozens Is such a check implementable using the current set of shaperglot instructions? If so, how would you implement it?

Khaled's idea in the linked thread of offering two ligatures is valid (though Bold Monday's implementation, which puts all marks on the second ل, is surely wrong), as is offering just one ligature and skipping it as soon as marks are present.

At least I don't see how this can be solved using a static shaperglot test definition.
Because you would have to compare the output buffers of ا+ل+ل+ه against ا+ل+ل+ه+ُ (for example), but check that just the base ا+ل+ل+ه differs. You need to compare two sequences that don't have the same input string, and so they are going to be different in any case.

I think this would be a dynamic check written in code. Then of course it makes no difference whether it's in FB or shaperglot, with the latter being the better host.
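
One way such a dynamic check could compare the two buffers is to drop the mark glyphs from both outputs and compare only what remains. The following Python sketch assumes a hypothetical `shape` callable that returns glyph names and a hypothetical `mark_glyphs` set (in a real implementation this would likely come from the font's GDEF mark class); none of these names are shaperglot or Fontbakery API.

```python
from typing import Callable, Iterable, List, Set

# Hypothetical shaper: maps input text to glyph names after shaping.
Shaper = Callable[[str], List[str]]


def base_glyphs(glyphs: Iterable[str], mark_glyphs: Set[str]) -> List[str]:
    """Drop mark glyphs so that only the base shaping is compared."""
    return [g for g in glyphs if g not in mark_glyphs]


def marks_ignored_by_ligature(
    shape: Shaper, base_text: str, marks: Iterable[str], mark_glyphs: Set[str]
) -> List[str]:
    """Return the marks whose presence left the base glyphs unchanged.

    Only meaningful when `base_text` alone shapes to a ligature; an
    unchanged base then means the ligature was formed despite the mark.
    """
    plain = base_glyphs(shape(base_text), mark_glyphs)
    ignored = []
    for mark in marks:
        marked = base_glyphs(shape(base_text + mark), mark_glyphs)
        if marked == plain:
            ignored.append(mark)
    return ignored
```

Because the marks are filtered out before the comparison, the two inputs no longer differ "in any case": a non-empty result lists exactly the marks that the ligature lookup skipped over.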

@khaledhosny
Contributor

FWIW, this issue is fixed upstream. So updating the version on GF should fix the issue.

@khaledhosny
Contributor

$ hb-info IBMPlexSansArabic-Regular.ttf --show-version
Version: Version 1.005
$ hb-view IBMPlexSansArabic-Regular.ttf "الله اللَّهُ"

yanone added a commit that referenced this issue Nov 8, 2024
yanone added a commit that referenced this issue Nov 8, 2024
@yanone yanone linked a pull request Nov 8, 2024 that will close this issue
2 tasks