Automatic verification of language codes for new translations #3335

Open
moisesbr-dw opened this issue Nov 13, 2020 · 3 comments

Comments

@moisesbr-dw
Contributor

Recently, new translations were added for ckb (Sorani or Central Kurdish) and nan (Southern Min or Minnan). These codes are present in ISO 639-3, but I don't know whether the DokuWiki translation system really verifies them. I guess there is no verification, because the Nias language was added as id-ni instead of nia (see #3252).

Could you please give some tips about where a translator specifies a new language code, so that this process could be enhanced with checks for correct ISO 639-1 (2-letter) and ISO 639-3 (3-letter) codes? This verification is needed because the language code has been used to generate a collator ever since #3115 was merged into the main code.

Remark: This issue is about new language codes, not existing language codes not conforming to ISO 639. For those, please see #3242 and #3252.

@Klap-in
Collaborator

Klap-in commented Nov 21, 2020

New languages are added by doing first a commit with at least one translated file in the new language folder.

We could add a unit test that checks whether the new language code is correct. Then developers at least get a warning they can act on.

Do you have a complete list of the correct language codes?
You are welcome to build the unit test; otherwise, others can make it.
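
A minimal sketch of what such a check could look like, written in Python purely for illustration (an actual DokuWiki test would be written in PHP). The code sets below are tiny illustrative samples, not the full ISO 639 tables, and `LEGACY_FOLDERS` and `invalid_language_folders` are hypothetical names. The assumption is that a new folder name must be a plain ISO 639-1 or 639-3 code unless it is explicitly allowlisted:

```python
import re

# Illustrative subset only; a real test would load the full ISO 639-1 and
# ISO 639-3 tables from a maintained data file.
ISO_639_CODES = {"en", "id", "pt", "ckb", "nan", "nia"}

# Hypothetical allowlist of existing folder names that carry a variant suffix.
LEGACY_FOLDERS = {"pt-br", "zh-tw"}

# Two or three lowercase ASCII letters, matching 639-1 and 639-3 code shapes.
CODE_SHAPE = re.compile(r"[a-z]{2,3}")


def invalid_language_folders(folder_names):
    """Return folder names that are neither a known ISO 639 code nor allowlisted."""
    invalid = []
    for name in sorted(folder_names):
        if name in LEGACY_FOLDERS:
            continue
        if CODE_SHAPE.fullmatch(name) and name in ISO_639_CODES:
            continue
        invalid.append(name)
    return invalid


print(invalid_language_folders(["en", "pt-br", "ckb", "nan", "id-ni"]))
# prints ['id-ni']
```

Checking the whole folder name rather than only the part before the hyphen is deliberate in this sketch: id-ni begins with the valid code id, so a prefix-only check would let it through.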

@Klap-in
Collaborator

Klap-in commented Mar 7, 2021

Hi, I am trying to add a pointer to a list of the correct language codes to this page: https://www.dokuwiki.org/localization#howto_add_a_new_language
Which list is a good one? This one: https://www.loc.gov/standards/iso639-2/php/English_list.php Or the one from Wikipedia: https://nl.m.wikipedia.org/wiki/Lijst_van_ISO_639-codes

@moisesbr-dw
Contributor Author

Hi @Klap-in , sorry for the late reply.

I use these lists from Wikipedia:

One can search the pages for a language name and find the corresponding code.

As far as I know, languages that have a 639-1 code also have (different) codes in 639-2 and 639-3, and the 639-1 code is the preferred form, since these are the de facto codes for "big" languages. For example, Portuguese is pt in 639-1 and por in 639-2 and 639-3.
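
The preference for the 2-letter form could be sketched as a small lookup, shown here in Python for illustration only. The mapping is a three-entry sample rather than the real table, and `preferred_code` is a hypothetical helper name:

```python
# Illustrative subset: ISO 639-3 code -> ISO 639-1 code, where one exists.
ISO_639_3_TO_1 = {"por": "pt", "eng": "en", "ind": "id"}


def preferred_code(code):
    """Return the 2-letter code when the sample table has one, else the input code."""
    code = code.lower()
    return ISO_639_3_TO_1.get(code, code)


print(preferred_code("por"))  # prints pt
print(preferred_code("ckb"))  # prints ckb (no 639-1 code, so the 3-letter code stays)
```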

I guess 639-3 is a bigger set comprising all codes in 639-2, plus ancient and minority languages.

The official page at ISO (https://www.iso.org/iso-639-language-codes.html) does not show the codes. It only provides information about the standard and a means to buy it.

I've been very busy in the last few months, but I still look forward to contributing code to tackle this issue and also #3242 and #3252.
