Differences in transliteration compared to behat/transliterator #35061
Comments
Thanks for the report and the links, that's interesting. A core design principle of the Symfony slugger is that we want to reuse standard data as a source: we're not linguists and we have no authority to maintain translit maps. The source we use is from Unicode; the CLDR sub-project maintains maps here: E.g. for This raises two questions to me:
Maybe the missing maps should be reported to the CLDR project? |
What Nicolas said is true. Symfony's rules should be correct because they aren't our rules but CLDR rules. Also, in some cases (but not in all of them) Google also agrees with Symfony/CLDR. Example: About the "return empty string vs throw exception" I think it'd be better to throw the exception. An empty slug will probably break the app anyway ... so better warn the developer. |
That's a tricky one.. That way the invoking side would be forced to deal with the situation (and generate a random slug for example) |
Well, I would say that if the input is the empty string, we should still return the empty string as output. That's the sane transliteration for it. The exceptional case would be a non-empty input producing an empty output. |
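The distinction drawn here (empty input is fine, non-empty input that collapses to nothing is exceptional) can be sketched in a few lines. This is a language-agnostic illustration in Python, not Symfony's slugger: `slug`, `to_ascii`, and `EmptySlugError` are hypothetical names, and the NFKD-based `to_ascii` is only a crude stand-in for CLDR rules, chosen because it silently drops scripts it cannot map.

```python
import re
import unicodedata


class EmptySlugError(ValueError):
    """Raised when non-empty input yields an empty slug (hypothetical name)."""


def to_ascii(text: str) -> str:
    # Stand-in transliteration (assumption): decompose, then drop non-ASCII.
    # Real CLDR-based rules cover far more scripts than this does.
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def slug(text: str, separator: str = "-") -> str:
    if text == "":
        # Empty in, empty out: the sane transliteration of nothing.
        return ""
    ascii_text = to_ascii(text)
    # Collapse every run of non-alphanumerics into one separator.
    result = re.sub(r"[^A-Za-z0-9]+", separator, ascii_text).strip(separator)
    if result == "":
        # The exceptional case: something went in, nothing came out.
        raise EmptySlugError(f"no ASCII slug could be produced for {text!r}")
    return result
```

With this sketch, `slug("")` returns `""`, while Khmer-only input, which the stand-in transliteration drops entirely, raises `EmptySlugError` so the caller can fall back to something else (a random slug, for example).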
But trying to create a slug for an empty string.. Does that make sense? |
The issue is how do we handle Khmer chars found in the middle of non-Khmer ones. |
Just a thought on future backwards compatibility: |
I would add another important difference: Symfony does not lowercase the string. |
…nicolas-grekas)

This PR was merged into the 5.2-dev branch.

Discussion
----------

[String] allow translit rules to be given as closure

| Q             | A
| ------------- | ---
| Branch?       | master
| Bug fix?      | no
| New feature?  | yes
| Deprecations? | no
| Tickets       | Fix #35061
| License       | MIT
| Doc PR        | -

Instead of trying to fix #35061 at our level, I propose to add a hook so that ppl can use a transliterator that fits their need (eg voku/portable-ascii or behat/transliterator) while still relying on the `SluggerInterface`.

Commits
-------

0bb48df [String] allow translit rules to be given as closure
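The hook idea in this PR description can be illustrated with a small sketch. It is written in Python and is not the Symfony API: `Slugger`, `default_translit`, and `german_translit` are hypothetical names, and the built-in fallback is a crude NFKD-based stand-in rather than CLDR rules. The point is only that the slugger keeps its interface while the transliteration step becomes a caller-supplied function.

```python
import re
import unicodedata
from typing import Callable, Optional


def default_translit(text: str) -> str:
    """Crude built-in fallback: strip diacritics, drop other non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


class Slugger:
    """Hypothetical slugger whose transliteration step is pluggable."""

    def __init__(self, translit: Optional[Callable[[str], str]] = None):
        # The hook: any str -> str callable replaces the built-in rules.
        self._translit = translit if translit is not None else default_translit

    def slug(self, text: str, separator: str = "-") -> str:
        ascii_text = self._translit(text)
        # Collapse every run of non-alphanumerics into one separator.
        return re.sub(r"[^A-Za-z0-9]+", separator, ascii_text).strip(separator)


def german_translit(text: str) -> str:
    # Caller-supplied rules: expand umlauts before the generic fallback.
    table = {"\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue"}  # ä, ö, ü
    expanded = "".join(table.get(ch, ch) for ch in text)
    return default_translit(expanded)


print(Slugger().slug("Sch\u00f6ne T\u00fcr"))                 # Schone-Tur
print(Slugger(german_translit).slug("Sch\u00f6ne T\u00fcr"))  # Schoene-Tuer
```

A third-party transliterator would be plugged in the same way, by wrapping it in a function that takes and returns a string.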
@garak we transliterate to ASCII. ASCII has uppercase letters. |
Well, it also has a lot of other characters that I would not put in a slug. |
I guess you meant |
Symfony version(s) affected: 5.0.2
Description
We used `behat/transliterator` until now, but because of issues with PHP 7.4 we are looking into replacing it. I switched our codebase to use `symfony/string` and ran into some differences in the results of transliteration to ASCII.

Notes:
- `symfony/string` is backed by http://dzcpy.github.io/transliteration for `汉语` (`han-yu`)
- `behat/transliterator` is backed by http://dzcpy.github.io/transliteration for `ភាសាខ្មែរ` (`bhaasaakhmaer`), `한국어` (`hangugeo`) and `हिन्दी` (`hindii`)
- for `العَرَبِية` and `မြန်မာဘာသာ`, http://dzcpy.github.io/transliteration disagrees with both libraries

How to reproduce