Why are bidi categories of Arabic-indic & Eastern Arabic numbers different? #85

Open
ntounsi opened this issue Oct 25, 2016 · 4 comments
Labels
i:data_formats Data formats & numbers l:arb Arabic l:pes Persian l:ur Urdu question

Comments

@ntounsi
Contributor

ntounsi commented Oct 25, 2016

Are all numbers equal in category and directional property?

  • Digit 2 (U+0032) has bidi category "EN, European Number". OK.
  • Arabic-Indic digit ٢ (U+0662) has bidi category "AN, Arabic Number". OK.
  • But the other one, ۲ (U+06F2), its Eastern (Extended) Arabic-Indic counterpart, has bidi category "EN, European Number", like digit 2. Is there a reason for the difference between the last two?
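
The categories listed above can be checked against the Unicode Character Database. The following is a minimal sketch using Python's standard `unicodedata` module; the helper names `bidi_classes` and `range_classes` are made up for this illustration, and the result reflects whatever Unicode version the local Python build ships with.

```python
import unicodedata

# The three "two" digits from the question, keyed by their Unicode names.
DIGITS = {
    "DIGIT TWO": "\u0032",
    "ARABIC-INDIC DIGIT TWO": "\u0662",
    "EXTENDED ARABIC-INDIC DIGIT TWO": "\u06F2",
}


def bidi_classes(chars):
    """Map each label to the Bidi_Class short name of its character."""
    return {label: unicodedata.bidirectional(ch) for label, ch in chars.items()}


def range_classes(first, last):
    """Return the set of Bidi_Class values found in an inclusive code point range."""
    return {unicodedata.bidirectional(chr(cp)) for cp in range(first, last + 1)}


if __name__ == "__main__":
    for label, cls in bidi_classes(DIGITS).items():
        print(f"{label}: {cls}")
    # Whole digit blocks, to confirm the "two" digits are not special cases.
    print("U+0660..U+0669:", range_classes(0x0660, 0x0669))  # {'AN'}
    print("U+06F0..U+06F9:", range_classes(0x06F0, 0x06F9))  # {'EN'}
```

This only reports the assigned classes; it does not explain why they differ.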

There is also a difference in bidi behavior: the same visual text a2b
will be displayed in an RTL context as b2a if the two is an Arabic Number, and as a2b if it is a European Number (simply like "a 2 b"). Aren't ALL numbers WEAK in their directional property?
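
On the "weak" question: UAX #9 does list both EN and AN among the weak types, so the difference lies in how the weak-type resolution rules treat each class (for example, W2 turns EN into AN after an Arabic letter, and W7 turns EN into L after a strong L), not in strong versus weak. Below is a rough sketch that groups characters by strength; the `strength` function and the three set names are hypothetical, and the grouping is transcribed from the table of bidirectional character types in UAX #9.

```python
import unicodedata

# Grouping of Bidi_Class values as given in UAX #9 (strong / weak / neutral).
STRONG = {"L", "R", "AL"}
WEAK = {"EN", "ES", "ET", "AN", "CS", "NSM", "BN"}
NEUTRAL = {"B", "S", "WS", "ON"}


def strength(ch):
    """Classify one character as 'strong', 'weak', 'neutral', or 'other'."""
    cls = unicodedata.bidirectional(ch)
    if cls in STRONG:
        return "strong"
    if cls in WEAK:
        return "weak"
    if cls in NEUTRAL:
        return "neutral"
    # Explicit formatting characters (LRE, RLI, ...) and unassigned code points.
    return "other"


if __name__ == "__main__":
    for ch in ("2", "\u0662", "\u06F2", "a", "\u0627", " "):
        print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), strength(ch))
```

All three digits come out as weak here, which suggests the question is really about EN versus AN resolution.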

@duerst

duerst commented Oct 25, 2016

As far as I remember, the difference in bidi category between Arabic-Indic digits and Eastern Arabic-Indic digits is due to the difference in bidi behavior desired in Arabic vs. Persian. Details should be available from Unicode.

@ntounsi
Contributor Author

ntounsi commented Oct 26, 2016

http://unicode.org/reports/tr9/#AN
Section 3.2, Bidirectional Character Types:
"[...]

  • As of Unicode 4.0, the Bidirectional Character Types of a few Indic characters were altered so that the Bidirectional Algorithm preserves canonical equivalence. That is, two canonically equivalent strings will result in equivalent ordering after applying the algorithm."

I guess the "few Indic characters" are the Eastern Arabic-Indic digits in the range U+06F0..U+06F9, which are classified as "European Number" rather than "Arabic Number".
I wonder what the "canonical equivalence" problem in question is. I didn't find more details.

@khaledhosny

I think it is referring to characters used for Indic languages, not the Arabic-Indic digits which AFAIK had this distinction from the start.
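
One way to sanity-check this reading: canonical equivalence only matters for characters that have a canonical decomposition or a non-zero combining class, and the digits in U+06F0..U+06F9 have neither. The sketch below uses the standard `unicodedata` module; `has_canonical_variation` is a hypothetical helper and only a rough heuristic, and U+09CB BENGALI VOWEL SIGN O is picked purely as an example of an Indic character with a canonical decomposition, not as one of the characters the UAX #9 note is known to refer to.

```python
import unicodedata


def has_canonical_variation(ch):
    """Rough heuristic: True if ch decomposes canonically or can be reordered.

    A character with neither property always normalizes to itself, so
    canonical equivalence cannot change how it is handled.
    """
    decomposes = unicodedata.normalize("NFD", ch) != ch
    reorders = unicodedata.combining(ch) != 0
    return decomposes or reorders


if __name__ == "__main__":
    extended_digits = [chr(cp) for cp in range(0x06F0, 0x06FA)]
    print(any(has_canonical_variation(ch) for ch in extended_digits))  # False

    # An Indic vowel sign that NFD splits into two code points.
    bengali_o = "\u09CB"
    print(has_canonical_variation(bengali_o))  # True
    print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", bengali_o)])
```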

@behnam behnam changed the title Are all numbers equal? Bidi Category of numbers are different Jan 24, 2017
@behnam behnam added this to Ideas to discuss in Authoring ALReq 1.0 Feb 28, 2017
@ebraminio
Contributor

ebraminio commented Mar 7, 2017

@shervinafshar and I had a discussion about this years ago here:
https://groups.google.com/forum/#!topic/persian-computing/602gqTIrlPQ because I found the Extended Arabic-Indic digits better suited to our use in one special case (though the alternative may be better in other cases).

I remember that @roozbehp (who I guess won't get pinged by my mention here) explained to a developer why these are different, in a very old mailing list discussion from around 2001(?). If my memory of this is correct, he would be a good person to ask about the reason for the difference.

@r12a r12a added the question label Mar 8, 2019
@r12a r12a changed the title Bidi Category of numbers are different Why are Bidi Category of numbers different? Mar 8, 2019
@r12a r12a changed the title Why are Bidi Category of numbers different? Why are bdi categories of numbers different? Mar 8, 2019
@r12a r12a changed the title Why are bdi categories of numbers different? Why are bidi categories of Arabic-indic & Eastern Arabic numbers different? Mar 8, 2019
@r12a r12a added the i:data_formats Data formats & numbers label Feb 5, 2020
@r12a r12a added l:arb Arabic l:pes Persian l:ur Urdu labels Feb 15, 2023
Projects
Authoring ALReq 1.0
  
Ideas + Discussions
5 participants