id: "tk_Arab"
script: "Arab"
...
exemplar_chars {
base: "a b ç d e ä f g h i j ž k l m n ň o ö p r s ş t u ü w y ý z"
auxiliary: "c q v x"
numerals: " - ‑ , % ‰ + 0 1 2 3 4 5 6 7 8 9"
punctuation: "- ‑ – — , ; : ! ? . … \" “ ” ( ) [ ] { } § @ * # { } { } { } { } { } { } { } { } { } { } { } { } { } { } { }"
index: "A B Ç D E Ä F G H I J Ž K L M N Ň O Ö P R S Ş T U Ü W Y Ý Z"
}
id: "ku_Cyrl"
language: "ku"
script: "Cyrl"
...
exemplar_chars {
base: "a b c ç d e ê f g h i î j k l m n o p q r s ş t u û v w x y z"
auxiliary: "á à ă â å ä ã ā æ è ĕ ë ē é ì ĭ ï ī í ñ ó ò ŏ ô ø ō œ ß ù ŭ ū ú ÿ"
marks: "◌̆ ◌̈"
punctuation: "- ‐ ‑ – — , ; : ! ? . … \' ‘ ’ \" “ ” ( ) [ ] § @ * / & # † ‡ ′ ″"
index: "A B C Ç D E Ê F G H I Î J K L M N O P Q R S Ş T U Û V W X Y Z"
}
I don't know how to fix this, because I'm not sure how they were generated. Just delete them?
I now understand where these came from, but they're clearly wrong and they interfere with script detection: when we grab a list of Arabic words, we get Latin ones as well, because Latin characters are listed as exemplars for a language that is nominally written in Arabic script. We should at least remove the exemplar characters.
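To find other entries with the same problem, something like the following rough sketch might help. It assumes the exemplar strings have already been read out of the data files, and it uses Unicode character names as a stand-in for a proper script lookup, since Python's standard library does not expose the script property. The `SCRIPT_NAME_PREFIX` table and the `mismatched_exemplars` helper are hypothetical names for illustration, not part of this repository.

```python
import unicodedata

# Hypothetical mapping from ISO 15924 script codes to the prefix that
# Unicode character names use for that script. Extend as needed.
SCRIPT_NAME_PREFIX = {
    "Arab": "ARABIC",
    "Cyrl": "CYRILLIC",
    "Latn": "LATIN",
}


def mismatched_exemplars(script, exemplars):
    """Return the letter exemplars whose Unicode name does not match `script`.

    `exemplars` is a space-separated string, as in the `base` field.
    Only tokens that start with a letter are checked, so punctuation,
    digits and mark placeholders are ignored.
    """
    prefix = SCRIPT_NAME_PREFIX[script]
    bad = []
    for token in exemplars.split():
        first = token[0]
        if not unicodedata.category(first).startswith("L"):
            continue
        if not unicodedata.name(first, "").startswith(prefix):
            bad.append(token)
    return bad


# Example: the base exemplars listed for tk_Arab above are all flagged.
tk_arab_base = "a b \u00e7 d e \u00e4 f g h"
print(mismatched_exemplars("Arab", tk_arab_base))
```

The name-prefix check is only a heuristic; a script-aware library would be more reliable, but this should be enough to list the entries that need a closer look.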