Added UNGEGN Amharic 2016 system #32

manuelfuenmayor · 2019-11-29T19:54:24Z

Specification here: http://www.eki.ee/wgrs/rom1_am.htm

ronaldtse · 2020-09-11T05:17:30Z

Ping @tsega for this

tsega · 2020-09-12T06:41:07Z

@ronaldtse there are two reasons the tests are failing here:

As in alalc-amh-ethi-latn-1997, a sixth-order character can sometimes be transliterated as the bare consonant, e.g. '\u1215' (ሕ) can be either hi̠ or h. The notes in the specification point this out as well:

(A) The vowel of the sixth order (i̠) is eliminated in spelling except when the actual pronunciation requires it (e.g. not Me̠ni̠gi̠si̠ti̠ but Me̠ngi̠st).

This I have corrected by adding the consonant only as the second entry in the yaml;

'\u1215' :       # ሕ
      - 'hi̠'
      - 'h'
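To illustrate how a multi-option entry like this might be consumed, here is a minimal Python sketch. It is not the project's implementation: `CHAR_MAP`, `default_romanization`, and `all_romanizations` are hypothetical names, the map is a tiny excerpt whose values are taken from the Me̠ngi̠st example in note (A), and the rule "the first listed option is the default" is an assumption.

```python
from itertools import product

# Hypothetical excerpt of a character map: each Ethiopic character lists its
# romanization options, and the first option is assumed to be the default.
CHAR_MAP = {
    "መ": ["me̠"],
    "ን": ["ni̠", "n"],   # sixth order: with or without the vowel
    "ግ": ["gi̠", "g"],
    "ሥ": ["si̠", "s"],
    "ት": ["ti̠", "t"],
    "ሕ": ["hi̠", "h"],
}


def default_romanization(text):
    """Pick the first listed option for every character; pass unknowns through."""
    return "".join(CHAR_MAP.get(ch, [ch])[0] for ch in text)


def all_romanizations(text):
    """Return every combination of listed options as a set of strings."""
    choices = [CHAR_MAP.get(ch, [ch]) for ch in text]
    return {"".join(parts) for parts in product(*choices)}


word = "መንግሥት"
print(default_romanization(word))
# A test could accept an expected value if it matches any listed combination.
print("me̠ngi̠st" in all_romanizations(word))
```

With this layout, swapping the order of the two entries for a character changes only the default output, while the set of accepted spellings stays the same.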

The next issue is capitalization; in Amharic there is no concept of capitalization. The tests are expecting the first letter of every word to be a capital letter. As far as I know, this is wrong.

How do I fix the second issue?
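The capitalization point can be checked directly against Unicode case data. The sketch below uses only Python's standard library; `has_case` and `capitalize_words` are hypothetical helpers, and treating capitalization as a separate post-processing step is an assumption, not something the test suite is known to do.

```python
import unicodedata


def has_case(ch):
    """True if the character changes under upper() or lower()."""
    return ch.upper() != ch or ch.lower() != ch


def capitalize_words(romanized):
    """Hypothetical post-processing step: capitalize each space-separated word."""
    return " ".join(w[:1].upper() + w[1:] for w in romanized.split(" "))


ethiopic = "\u1215"  # ሕ
print(unicodedata.category(ethiopic))      # general category of the syllable
print(has_case(ethiopic), has_case("h"))   # Ethiopic has no case; Latin does
print(capitalize_words("addis abeba"))
```

Because the source character carries no case information, a character map alone cannot tell whether a romanized word should start with a capital; any capitalization has to come from a rule applied afterwards.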

ronaldtse · 2020-09-12T15:35:40Z

@tsega thanks for this.

> This I have corrected by adding the consonant only as the second entry in the yaml;

Should we instead use h as the first choice, since the vowel is written only when it is actually pronounced? Which case is more common? We should use the more common case as the default.

> The next issue is capitalization; in Amharic there is no concept of capitalization. The tests are expecting the first letter of every word to be a capital letter. As far as I know, this is wrong.

How are location names at unstats.un.org capitalized? We should probably follow them, e.g. if they are all lowercase, we should lowercase as well.

tsega · 2020-09-13T05:26:29Z

@ronaldtse location names from unstats.un.org are all capitalized when transliterated into English since they are names of places. Is the sample data only names of locations? If so, then the tests are consistent in capitalizing the results and I would need to change my test data.

However, I was under the impression that the test data can come from anywhere and just has to be mapped correctly.

ronaldtse · 2020-09-13T06:14:21Z

We will need to use machine learning to decide whether a word is a name (place, people) or not, so at this point we can just keep all examples in lower case. If that's fine we can merge and consider this done.

ronaldtse · 2020-09-16T23:54:02Z

Completed in #414

manuelfuenmayor requested review from andrew2net and ronaldtse November 29, 2019 19:54

manuelfuenmayor self-assigned this Nov 29, 2019

manuelfuenmayor mentioned this pull request Nov 29, 2019

Add systems from UNGEGN RGMS #30

Closed

manuelfuenmayor and others added 2 commits February 25, 2020 00:36

Added UNGEGN Amharic 2016 system

0b8c28e

ungegn-amh-Ethi-Latn: correct letter case options

f908904

ronaldtse force-pushed the UNGEGN-Amharic-romanization branch from 087297e to f908904 Compare February 24, 2020 17:03

ronaldtse mentioned this pull request Jun 8, 2020

Implement Arabic transliteration and a "fully-pointed Arabic" form #309

Open

ronaldtse assigned AhMohsen46 Sep 5, 2020

ronaldtse assigned tsega and unassigned manuelfuenmayor and AhMohsen46 Sep 15, 2020

ronaldtse removed the request for review from andrew2net September 15, 2020 02:10

tsega mentioned this pull request Sep 16, 2020

Ungegn amharic romanization #414

Merged

ronaldtse mentioned this pull request Sep 16, 2020

Add UN Amharic transliteration 1967 system #390

Closed

ronaldtse closed this Sep 16, 2020
