Which transliteration schemes should we use? #16

Closed
r12a opened this issue Nov 24, 2015 · 16 comments

Comments

@r12a
Contributor

r12a commented Nov 24, 2015

Looking at the way Najib spelled 'Riqaa' made me realise that we need to establish some standard approach to transliterating Arabic and Persian words. I really don't want this to get in the way of creating real content for the document, but it's something we should take a look at (and spend as little time on as possible).

@ntounsi
Contributor

ntounsi commented Nov 24, 2015

I agree that we should establish a standard approach to transliterating Arabic and Persian words.
I didn't give it much attention in the first draft.
BTW, it is Riqa’a or Riqa’ (الرقاع). Should the letters Alef+Ain be a'a or simply a'?
Same for Nasta'liq (نستعليق).
Najib

@khaledhosny

BTW, it is رُقعة not رِقعة AFAIK.

@ntounsi
Contributor

ntounsi commented Nov 26, 2015

In fact, Riqa’ is (one of) the plurals of Roq’a, which means sheet or patch.

A dictionary of Arabic meanings
http://www.almaany.com/ar/dict/ar-ar/الرقاع/
http://www.almaany.com/ar/dict/ar-ar/الرقعة/
gives one of the definitions of Roq’a (and thus Riqa’) as
“a piece of paper or leather on which one writes”
الرُّقْعةُ : قطعةٌ من الورق أو الجلد يكتب فيها
That suits us...

Another dictionary takes the same position.
http://dictionnaire.sensagent.com/خط%20الرقعة/ar-ar/
http://dictionnaire.sensagent.com/رقاع%20%28خط%29/ar-ar/

Comment:
Like other scripts, Roq’a has a history and many authors/artists.
Some of the literature talks about Roq’a, some about Riqa’, and some about both. I suspect they refer to the same script, but I wonder whether there is a reason (regional or historical?) why some say Riq’a and others Roq’a. So I am going to do some further investigation, just to have it clear.

Najib

@khaledhosny

They are two different things, الرقاع is an early Arabic style that fell out of use, while رُقعة is a much more recent style invented by Ottoman calligraphers. Some authors confuse them and think they are the same thing.

@r12a
Contributor Author

r12a commented Nov 26, 2015

Here is some information about Arabic transcription in a handy form.
https://en.wikipedia.org/wiki/Romanization_of_Arabic
i'm inclined to suggest using ALA-LC with short vowel signs to make a transcription rather than a transliteration. Its use of h in sh, gh, etc., is fairly widely understood.

Wikipedia also lists transcription schemes for Persian:
https://en.wikipedia.org/wiki/Romanization_of_Persian

@r12a
Contributor Author

r12a commented Nov 26, 2015

(By the way, fwiw, i'm working on Arabic and Persian pickers, which should eventually be able to produce transcriptions automatically or semi-automatically, which may help for consistency when creating content. You can see the kind of thing by going to http://rishida.io/pickers/bengali/ pasting in ছখজদ and clicking on the 'toISO' button. There's some ambiguity about the appropriate transcription for the end – just click on one of the characters on the yellow background to resolve that.)

@ntounsi
Contributor

ntounsi commented Nov 26, 2015

Riqa' vs Roq'a
It seems that Riqa' is much older than Roq'a.

@shervinafshar
Contributor

As fascinating as the etymology of "رُقعة" and "رِقعة" is to me, I guess the intention was to agree on an existing transliteration scheme here, either one of those linked by Richard or one from any other resource that you encountered in your research. Any thoughts on that?

@r12a
Contributor Author

r12a commented Dec 7, 2015

fwiw, there's an early version of an Arabic picker at http://r12a.github.io/pickers/arabic/. I mention it because it may help to take out some of the guesswork and bring some consistency when creating transcriptions of Arabic words in our text.

if you paste in a word such as رُقعَة and click on toLoc it will produce an LOC transcription. The algorithm takes into account various rules about combinations, but there sometimes remain ambiguities. For example, رُقعَة produces two alternatives for teh marbuta on a yellow background – simply click on the one you want (h) to finalise the transcription. There are a couple of things that still need attention, and i'm still working on it, but it largely works.

Hints: don't add vowels to the definite article (ال), but put them just about everywhere else; no need usually for sukun; still need to work on some īya sequences; read the notes for more info about how to use the picker.
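
A minimal sketch of how that kind of ambiguity could be surfaced, for illustration only. It is not the picker's algorithm: the table is a hypothetical subset covering just the letters of رُقعَة, loosely following the LOC romanization, and `TABLE` and `candidates` are invented names.

```python
from itertools import product

# Hypothetical, deliberately tiny table: each Arabic character maps to a list
# of possible Latin renderings. More than one entry means "ambiguous".
TABLE = {
    "\u0631": ["r"],       # reh
    "\u0642": ["q"],       # qaf
    "\u0639": ["\u02bb"],  # ain, written with the modifier letter turned comma
    "\u064e": ["a"],       # fatha
    "\u064f": ["u"],       # damma
    "\u0650": ["i"],       # kasra
    "\u0629": ["h", "t"],  # teh marbuta: the reader has to pick one
}


def candidates(word):
    """Return every transcription the table allows for a vowelled word."""
    options = []
    for ch in word:
        if ch not in TABLE:
            raise KeyError(f"no rule for U+{ord(ch):04X}")
        options.append(TABLE[ch])
    # One output string per combination of the ambiguous choices.
    return ["".join(combo) for combo in product(*options)]


# reh, damma, qaf, ain, fatha, teh marbuta
print(candidates("\u0631\u064f\u0642\u0639\u064e\u0629"))
```

For this word the sketch yields two candidates that differ only in the final letter, which corresponds to the two alternatives shown on the yellow background.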

@r12a
Contributor Author

r12a commented Dec 7, 2015

btw, the picker has a couple of additional functions that may prove handy when editing the alreq doc.

if you want to mention an arabic letter with its Unicode name and code point in the text, highlight the letter in the picker (or make it the sole character in the box) and click on charLink. This will produce markup such as the following:

<a href="/scripts/arabic/block#char06A7"><span class="uname">U+06A7 ARABIC LETTER QAF WITH DOT ABOVE</span> (<span lang="ar" dir="rtl">ڧ</span>)</a> 

actually you can do this for multiple code points at the same time, if you want.
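
For anyone producing this markup without the picker, here is a rough Python sketch that builds the same shape of link. It is an assumption-laden illustration, not the picker's code: `char_link` and its `block_path` parameter are invented names, and the path is simply copied from the sample above.

```python
import unicodedata


def char_link(text, block_path="/scripts/arabic/block"):
    """Build one link per code point, in the shape of the sample markup."""
    links = []
    for ch in text:
        cp = f"{ord(ch):04X}"                    # e.g. "06A7"
        name = unicodedata.name(ch, "UNNAMED")   # Unicode character name
        links.append(
            f'<a href="{block_path}#char{cp}">'
            f'<span class="uname">U+{cp} {name}</span> '
            f'(<span lang="ar" dir="rtl">{ch}</span>)</a>'
        )
    # Several code points give several links, separated by spaces.
    return " ".join(links)


print(char_link("\u06a7"))
```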

alternatively, if you want to produce an example for the text, input or highlight the relevant text and click on makeEg. See the picker notes for how to produce examples marked up for the arabic text, plus zero or more of the following: a latin transcription, an IPA transcription, and the meaning. For example:

العربي/alʕarabiː/alʻarabī/Arabic  (actually the arabic will appear to the right as you type or paste this into the input box)

will produce

<span class="ex" lang="ar" dir="rtl">العربي</span> <span class="ipa">alʕarabiː</span> <span class="trans">alʻarabī</span> <span class="meaning">Arabic</span>

in either of the above cases, just triple click on the output and copy then paste into the source.
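
A small sketch of the makeEg idea in Python, in case it helps to see the mapping from the slash-separated input to the markup. This is not the picker's implementation: `make_example` is an invented name, and the field order (arabic, IPA, transcription, meaning) is an assumption read off the example above.

```python
from html import escape

# Assumed order of the optional fields that follow the Arabic text.
FIELDS = ("ipa", "trans", "meaning")


def make_example(line):
    """Turn 'arabic/ipa/trans/meaning' into the span markup shown above."""
    arabic, *rest = line.split("/")
    parts = [f'<span class="ex" lang="ar" dir="rtl">{escape(arabic)}</span>']
    for css_class, value in zip(FIELDS, rest):
        if value:  # skip fields that were left empty
            parts.append(f'<span class="{css_class}">{escape(value)}</span>')
    return " ".join(parts)


print(make_example("العربي/alʕarabiː/alʻarabī/Arabic"))
```

Leaving a field empty (two slashes in a row) drops that span, so the same function covers the "zero or more" cases.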

@r12a
Contributor Author

r12a commented Dec 7, 2015

i recommend we follow http://www.loc.gov/catdir/cpso/romanization/arabic.pdf for the Arabic transcriptions. (The rules in that document are taken into account in the picker.)

@mostafah
Contributor

mostafah commented Dec 8, 2015

I checked ALA-LC’s Persian transliteration method. It seems reasonable most of the time, but some of its rules, especially for vowels, seem a bit strange to me, like using “i” and “u” where I expected “e” and “o”. I found UN 2012 (documented here and here) a bit more familiar.

@ntounsi
Contributor

ntounsi commented Dec 8, 2015

In general, I am not inclined toward "very strict" transcription. (I prefer to read iman rather than īmān, etc.)
Of course, letters like Waw و, for example, may have a consonant role (wa) or a long vowel role (ū).
In my experience, some authors, e.g. orientalists, use formal transcription. Others, targeting a wider audience, don't use transcription.

@behnam
Member

behnam commented Dec 15, 2015

Okay, one open question here is transliteration choice for Persian.

r12a changed the title from "Transliterations" to "Which transliteration schemes should we use?" on Dec 15, 2015

@shervinafshar
Contributor

I had a long-overdue action item to take a look at the transliteration schemes available (ALA vs. UNGEGN, proposed by the Iranian ICSGN National/NCC organizations).

I tend to agree that the UNGEGN scheme of 2012 is more reflective of common transliteration practices, e.g. the Encyclopedia Iranica scheme. As Mostafa mentioned, ALA's treatment of "e" and "o" is peculiar.

@r12a
Contributor Author

r12a commented Nov 20, 2019

Just to clarify the resolution of this issue: for transcriptions, we decided to use LOC for Arabic, and UNGEGN for Persian. And we use those as transcriptions, not transliterations: i.e. they don't necessarily allow unambiguous reconversion back to the arabic script, although they may be closer to the actual phonetics than a transliteration.
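
To illustrate why a transcription may not convert back unambiguously, here is a toy Python sketch. The table is hypothetical and not taken from LOC, UNGEGN, or any other published scheme: it simply maps three letters that Persian pronounces alike to the same Latin letter, then inverts the mapping.

```python
from collections import defaultdict

# Hypothetical phonetic table, for illustration only.
PHONETIC = {
    "\u062b": "s",  # theh
    "\u0633": "s",  # seen
    "\u0635": "s",  # sad
}


def inverse(table):
    """Group source letters by the Latin letter they are written as."""
    back = defaultdict(list)
    for source, latin in table.items():
        back[latin].append(source)
    return dict(back)


# A Latin letter with more than one source letter cannot be converted
# back to the original script without extra information.
for latin, sources in inverse(PHONETIC).items():
    print(latin, "<-", len(sources), "possible source letters")
```

A transliteration avoids this by giving each source letter its own distinct Latin form, usually at the cost of extra diacritics.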

6 participants