@ronaldtse
Contributor

From @manuel489

@ronaldtse
Contributor Author

@manuel489 I've updated the file and fixed some issues. Note that not all characters can be copied and pasted; we will need to use the Unicode code points directly, sometimes with multiple code points for a single character (e.g. n with a top bar).
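
To illustrate the multi-code-point case, here is a minimal Python sketch. It assumes that "n with top bar" means the letter n followed by U+0304 COMBINING MACRON; the thread does not name the exact code points, so treat that choice, and the helper name `code_points`, as illustrative only.

```python
import unicodedata

# Assumption: "n with top bar" is LATIN SMALL LETTER N + COMBINING MACRON.
n_macron = "n\u0304"
# For comparison: a + COMBINING MACRON, which does have a precomposed form.
a_macron = "a\u0304"


def code_points(text):
    """List the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


# NFC composes a + macron into the single code point U+0101, but leaves
# n + macron as two code points because Unicode has no precomposed form.
print(code_points(unicodedata.normalize("NFC", a_macron)))  # ['U+0101']
print(code_points(unicodedata.normalize("NFC", n_macron)))  # ['U+006E', 'U+0304']
```

Writing such characters as explicit escapes in the map keeps the rule unambiguous even when an editor or clipboard silently drops the combining mark.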

@ronaldtse
Contributor Author

  1. Please help fill in the transliteration table.

  2. This document also defines the "Yaghoubi" system, which should be encoded as a separate YAML file.

@manuelfuenmayor
Contributor

Hi @ronaldtse,
I have pushed my changes. I implemented both the BGN/PCGN and Yaghoubi systems in separate files.

I'd like to make some comments:

  1. There are several cases where a Unicode value is repeated.
  2. I couldn't find an acronym for "Afghan" in the ISO 639-2 codes, so I wrote "afg" according to ISO 3166-2.
  3. There were cases where I used four Unicode values in a row to depict a Latin character.
  4. There were several Unicode values enclosed in parentheses (I didn't use them).

@ronaldtse
Contributor Author

ronaldtse commented Nov 21, 2019

  1. There are several cases where a Unicode value is repeated.

Could you explain which cases these are?

  2. I couldn't find an acronym for "Afghan" in the ISO 639-2 codes, so I wrote "afg" according to ISO 3166-2.

The language code should be prs for Dari (https://iso639-3.sil.org/code/prs). I fixed it in 0265be7.

  3. There were cases where I used four Unicode values in a row to depict a Latin character.

Each source character should be implemented as a separate rule.

e.g. these:

 # VOWELS  
  'ئه' / 'ه' : 'e' # See notes 1 and 5
  'ئا' / 'ا' : 'a' # See note 1
  'ئي' / 'ي' : 'î' # See notes 1, 6 and 7

Should be implemented as:

 # VOWELS  
  'ئه' : 'e' # See notes 1 and 5
  'ه' : 'e' # See notes 1 and 5
  'ئا' : 'a' # See note 1
  'ا' : 'a' # See note 1
  'ئي' : 'î' # See notes 1, 6 and 7
  'ي' : 'î' # See notes 1, 6 and 7
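
The rewrite above is mechanical, so it can be sketched in a few lines of Python. This is only an illustration: the regular expression and the function name `split_rule` are hypothetical, and they assume the combined notation always looks like `'SRC1' / 'SRC2' : 'TARGET' # comment` as in the example.

```python
import re

# Hypothetical pattern for the combined notation shown above:
#   'SRC1' / 'SRC2' : 'TARGET' # optional comment
RULE = re.compile(
    r"^\s*(?P<sources>'[^']*'(?:\s*/\s*'[^']*')*)"
    r"\s*:\s*'(?P<target>[^']*)'\s*(?P<comment>#.*)?$"
)


def split_rule(line):
    """Expand one combined rule into one rule per source string.

    Lines that do not look like a rule (headings, blanks) are returned as-is.
    """
    match = RULE.match(line)
    if match is None:
        return [line]
    sources = re.findall(r"'([^']*)'", match.group("sources"))
    comment = f" {match.group('comment')}" if match.group("comment") else ""
    target = match.group("target")
    return [f"  '{src}' : '{target}'{comment}" for src in sources]


combined = [
    " # VOWELS",
    "  'ئه' / 'ه' : 'e' # See notes 1 and 5",
    "  'ئا' / 'ا' : 'a' # See note 1",
]
for line in combined:
    for rule in split_rule(line):
        print(rule)
```

Each source string ends up on its own line with the same target and note reference, which is the shape requested in the review.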

@manuelfuenmayor
Contributor

manuelfuenmayor commented Nov 21, 2019

Could you explain which cases these are?

@ronaldtse, for example:

[Screenshot: capture]

Comment on lines 54 to 56
a. Initially, it indicates that the word begins with a vowel or diphthong; the alif itself is not romanized, but rather the short vowel it “carries” is romanized; e.g., ميړ أَسَلم ژرَندَه → Mī Aslam Zhrandah
a. Initially, it indicates that the word begins with a vowel or
diphthong; the alif itself is not romanized, but rather the short vowel
it “carries” is romanized; e.g., ميړ أَسَلم ژرَندَه → Mī Aslam Zhrandah
Contributor

@manuelfuenmayor manuelfuenmayor Nov 22, 2019

@ronaldtse, why must text paragraphs be split across several lines?

Contributor Author

Ah, because it reads better in the editor :-)

- 'a'
- 'â'

# Both e and i are available to romanize this short vowel,
Contributor Author

The same applies here: these comments should be on separate lines. In addition, comments and code should sit on separate lines, so that git tracks changes to code separately from changes to comments instead of mixing the two.
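
The effect on diffs can be shown with a small Python sketch that uses the standard `difflib` module as a stand-in for `git diff`. The rule text and the helper name `changed_lines` are made up for illustration.

```python
import difflib


def changed_lines(old, new):
    """Return only the added and removed lines of a unified diff."""
    diff = list(
        difflib.unified_diff(old.splitlines(), new.splitlines(), lineterm="", n=0)
    )
    # Drop the two file header lines and the "@@" hunk markers.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Comment on the same line as the rule: editing the comment touches the rule line.
inline_old = "'x' : 'e' # See note 1"
inline_new = "'x' : 'e' # See notes 1 and 5"

# Comment on its own line: editing the comment leaves the rule line untouched.
separate_old = "# See note 1\n'x' : 'e'"
separate_new = "# See notes 1 and 5\n'x' : 'e'"

print(changed_lines(inline_old, inline_new))
print(changed_lines(separate_old, separate_new))
```

In the first case the rule itself shows up as removed and re-added even though only the note changed; in the second case the diff contains the comment lines alone.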

@AhMohsen46
Contributor

@AhMohsen46 AhMohsen46 closed this Sep 20, 2020
@ronaldtse ronaldtse deleted the manual-bgnpcgn-afgn branch November 2, 2020 09:10
