Fix opus_gnome dataset card #4806
|
The documentation is not available anymore as the PR was closed or merged. |
|
@gojiteji why have you closed this PR and created an identical one? |
|
@albertvillanova |
|
Both are identical. And you can push additional commits to this branch. |
|
I see. Thank you for your comment. |
|
Anyway, @gojiteji thanks for your contribution and this fix. |
|
Once you have modified the |
Is there anything I should do? |
|
If you would like to address them as well in this PR, it would be awesome: https://github.com/huggingface/datasets/runs/7741104780?check_suite_focus=true |
|
These are the 2 error messages: |
|
In principle there are 2 errors: The first one says, the title of the README does not start with
|
|
In relation with the languages:
|
|
Thank you for the detailed information. I'm checking it now. |
|
|
I added |
|
Thanks for your investigation and fixes to the dataset card structure! I'm just making some suggestions before merging this PR: see below. |
|
Should I create PR for Or removing |
albertvillanova
left a comment
My suggestions for the language codes.
| - th | ||
| - tk | ||
| - tl | ||
| - tmp |
This is a deprecated language tag: it should be replaced by tyj.
However, tyj should be added to our languages.json:
"tyj": "Tai Do; Tai Yo",
See reference here: https://iso639-3.sil.org/request/2015-019
| "ar-TD": "Arabic (Chad)", | ||
| "ar-TN": "Arabic (Tunisia)", | ||
| "ar-YE": "Arabic (Yemen)", | ||
| "ara":"Arabic", |
When several codes exist for the same language (ISO 639-1, 639-2, 639-3), we use the one from the lowest-numbered part. For Arabic, we use ar.
As ar already appears in the README, I would remove ara from the README and would not add it to languages.json.
| "ara":"Arabic", |
| "byn": "Blin; Bilin", | ||
| "bzd": "Bribri", | ||
| "ca": "Catalan", | ||
| "cat": "Catalan; Valencian", |
The same as above: for "Catalan; Valencian" we use ca instead. I would remove cat from README and from here.
| "cat": "Catalan; Valencian", |
| "gom": "Goan Konkani", | ||
| "gor": "Gorontalo", | ||
| "got": "Gothic", | ||
| "GR": "Greek", |
The language code for Greek is el. I would remove gr from README and from here.
| "GR": "Greek", |
|
Once you address these issues, all the CI tests will pass. |
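The language-code suggestions above amount to a small normalization of the README's language list. Here is a minimal sketch, assuming a hypothetical helper `normalize_language_tags` and a hand-written replacement table built only from the codes discussed in this review. It is not the validation code that the CI runs. Mapping gr to el is also an assumption: the review only asks to remove gr.

```python
# Hypothetical replacement table built from the review comments above;
# it is not part of the datasets library.
REPLACEMENTS = {
    "tmp": "tyj",  # deprecated tag, replaced by tyj (Tai Do; Tai Yo)
    "ara": "ar",   # two-letter ISO 639-1 code preferred when it exists
    "cat": "ca",   # same rule for Catalan; Valencian
    "gr": "el",    # assumption: replace the wrong Greek code instead of dropping it
}


def normalize_language_tags(tags):
    """Apply the replacements and drop duplicates, keeping first-seen order."""
    seen = set()
    result = []
    for tag in tags:
        # Lookup is case-insensitive; tags without a replacement are kept unchanged.
        fixed = REPLACEMENTS.get(tag.lower(), tag)
        if fixed not in seen:
            seen.add(fixed)
            result.append(fixed)
    return result


print(normalize_language_tags(["ar", "ara", "ca", "cat", "GR", "th", "tk", "tl", "tmp"]))
# → ['ar', 'ca', 'el', 'th', 'tk', 'tl', 'tyj']
```

Because duplicates are dropped after replacement, ara and cat disappear when ar and ca are already listed, which matches the suggestion to remove them from the README.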
Co-authored-by: Albert Villanova del Moral <8515462+albertvillanova@users.noreply.github.com>
|
Once the remaining changes are addressed (see unresolved above), we will be able to merge this:
|
|
I did the five changes. |
albertvillanova
left a comment
Thanks, @gojiteji, for the original fix and all the subsequent improvements to the dataset documentation card.
I fixed issue #4805.
I changed "gnome" to "opus_gnome" in README.md.
Fix #4805