ToAscii: Some Russian texts in Cyrillic (Azbuka) are rewritten incorrectly #203
Comments
@dg Yes, I detected it by:
/**
 * Converts UTF-8 string to ASCII.
 */
public static function toAscii(string $s): string
{
	static $transliterator = null;
	if ($transliterator === null && class_exists('Transliterator', false)) {
		$transliterator = \Transliterator::create('Any-Latin; Latin-ASCII');
	}
	dump(['transliterator' => $transliterator, 'iconv' => ICONV_IMPL]);
Dumped:
Thanks.
Transliterator converts
Probably solution is to add
I think we can use (for the Czech language):
[
	'ё' => 'jo',
	'ъ' => '',
	'ы' => 'y',
	'ь' => '',
	'э' => 'eh',
	'ю' => 'ju',
	'я' => 'ja',
]
But for correct behavior it is very important to use the target language (which can be loaded from the second parameter
And the table for English:
[
	'ё' => 'jo',
	'ъ' => '',
	'ы' => 'y',
	'ь' => '',
	'э' => 'e',
	'ю' => 'yu',
	'я' => 'ya',
]
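To make the idea of language-dependent tables concrete, here is a minimal sketch of how the two tables above could be applied before any generic transliteration step. The function name `replaceCyrillicSpecials`, the `$lang` parameter, the language codes `cs` and `en`, and the fallback to English are all assumptions made for illustration; none of them is part of the library. Only the lowercase letters listed in the tables are handled.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (not part of the library): rewrites the Cyrillic
 * letters from the tables above according to the target language and
 * leaves every other character untouched for a later transliteration step.
 */
function replaceCyrillicSpecials(string $s, string $lang = 'en'): string
{
	// Tables copied from the comment above; only lowercase letters are covered.
	static $tables = [
		'cs' => ['ё' => 'jo', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'eh', 'ю' => 'ju', 'я' => 'ja'],
		'en' => ['ё' => 'jo', 'ъ' => '', 'ы' => 'y', 'ь' => '', 'э' => 'e', 'ю' => 'yu', 'я' => 'ya'],
	];

	// Assumption: unknown language codes fall back to the English table.
	$table = $tables[$lang] ?? $tables['en'];

	// strtr() with an array replaces each key with its value in one pass
	// and does not rescan text it has already replaced.
	return strtr($s, $table);
}

echo replaceCyrillicSpecials('я', 'cs'), "\n"; // ja
echo replaceCyrillicSpecials('я', 'en'), "\n"; // ya
```

The remaining characters would still need to go through the existing transliteration; this sketch only shows where a target language could plug in.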
Hi,
I speak Russian and I noticed that the transliteration of Cyrillic (Azbuka) to ASCII is sometimes inaccurate.
For example, the string:
should be dlja, not the current dla, because the character я means ja (Czech) or ya (English).
Sample:
If my suggestion makes sense, I can implement a better transliteration with support for whole syllables and special cases.
Thanks.