YaST, the installer of SUSE Linux Enterprise Server, is localized to many languages, one of them being Arabic. Dealing with text that is a mixture of Latin and Arabic script is complex and sometimes we have to deal with interesting bugs. Here are some of them, along with the way we have solved them. Hopefully it will be useful for you.
- Wikipedia: Bi-directional text, an overview of the concepts
- Unicode Standard Annex #9: Unicode Bidirectional Algorithm, the gory details, 50 pages of them
Test bsc#989383, where parentheses are broken in a LTR layout but correct in a RTL layout.
Run this (never mind that the browser or the terminal may render the argument incorrectly already):
./yast-test.rb "ذ&اكرة USB كبيرة السعة التخزينية (عصا، قرص ذاكرة USB)..."
Here the global context is left to right. Even if you cannot read Arabic, you can observe that some of the widgets have both parentheses facing the same way, which is wrong:
When we set the context to right-to-left, the labels are rendered correctly:
This is a case of an English word with a parenthesised English gloss in an Arabic text. Let's say "I love GNU (GNU's not Unix)":
./yast-label.rb "انا أحب GNU (Gnu's Not Unix)."
It helps to add an invisible right-to-left-mark (U+200F) between the GNU and the opening parenthesis:
./yast-label.rb $'انا أحب GNU \u200F(Gnu\'s Not Unix).'
"... may be set if hostname(5) does not reflect the FQDN ..."
./yast-label.rb $'يمكن تعيينه إذا كان hostname(5) لا يعكس الاسم المميز المؤهل بالكامل'
Fixed by adding a Left-to-right mark, U+200E, after the (5):
./yast-label.rb $'يمكن تعيينه إذا كان hostname(5)\u200e لا يعكس الاسم المميز المؤهل بالكامل'
The English string is "URIs (ldap://) of LDAP servers (comma separated)".
If we use no control characters, it comes out quite confused:
./yast-label.rb $'عناوين URI (ldap://) لخوادم LDAP (مفصولة باستخدام الفاصلة)'
Adding a RLM before "ldap" helps a bit:
./yast-label.rb $'عناوين URI (\u200fldap://) لخوادم LDAP (مفصولة باستخدام الفاصلة)'
But the full fix is trickier: surrounding the parens with LRMs and separating them from "URI" with a RLM.
./yast-label.rb $'عناوين URI \u200f\u200e(ldap://)\u200e لخوادم LDAP (مفصولة باستخدام الفاصلة)'