feat(i18n): able to finetune transliterator #12378
Related: #11880
This needs better documentation, ideally on jellyfin.org. The changelog should probably also mention it as a user action needed to fix undesired behavior. Otherwise LGTM.
private static readonly Lazy<Transliterator> _transliterator = new(() => Transliterator.GetInstance(
    "Any-Latin; Latin-Ascii; Lower; NFD; [:Nonspacing Mark:] Remove; [:Punctuation:] Remove;"));
private static readonly Lazy<string> _transliteratorId = new(() =>
    Environment.GetEnvironmentVariable("JELLYFIN_TRANSLITERATOR_ID")
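For readers who want to try the variable read in the diff above, here is a minimal sketch of setting it before starting the server. The value is an assumption: it takes the rule chain shown in the diff and swaps "Any-Latin" for "Hangul-Latin; Hiragana-Latin", as the PR description suggests. It has not been checked against a running Jellyfin instance, and the launch command in the last comment is hypothetical.

```shell
#!/bin/sh
# Assumption: keep every step of the rule chain from the diff, but replace
# "Any-Latin" so that only Hangul and Hiragana are transliterated.
JELLYFIN_TRANSLITERATOR_ID="Hangul-Latin; Hiragana-Latin; Latin-Ascii; Lower; NFD; [:Nonspacing Mark:] Remove; [:Punctuation:] Remove;"
export JELLYFIN_TRANSLITERATOR_ID

# Print the value a child process (such as the server) would inherit.
printenv JELLYFIN_TRANSLITERATOR_ID

# Then start the server from this same environment, for example:
# jellyfin    (hypothetical launch command; use whatever your install provides)
```

Because the value in the diff is wrapped in a Lazy initializer, it is presumably read only once, so a changed value would likely need a server restart to take effect.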
I'm personally fine with reading the env var directly, but I'm not sure whether there is a better way to integrate the configuration for string extensions.
Agree. Related to this, maybe we should simply change
Actually it is not. Computers do sort non-ASCII characters, but not necessarily in a way humans find acceptable, and using ASCII-only characters is an easier way to control sorting behavior. But if users really want to input a non-ASCII character, that is also OK.
Is there any real-life usage example to help me better understand this usage? If it's intended to allow
Because there are too many of them
For example, the ASCII order of the romaji for hiragana and katakana is not the same as the traditional Japanese あいうえお order. Language is just complicated, and you cannot assume the ASCII order is the most correct order.
If a user has too many items that start with a non-ASCII character, then I'd say the right-hand side filter is completely useless for that user. Something like an Alphabetic Index might help improve this, though I'm not 100% sure about it. Whether or not that can be improved, it would be a completely new discussion topic that is unrelated to this PR. Anyway, as a conclusion related to this PR: we won't change the wording of "Sort Title" to something else :)
Changes
After #11172, this gives users the ability to fine-tune the behavior of the existing transliterator to fit their own needs. All they need to do is set the JELLYFIN_TRANSLITERATOR_ID environment variable to a preferred value, which is explained in this page and this page.
This approach is more flexible than only offering an option to disable the ICU transliterator. Users can then:
"Hangul-Latin; Hiragana-Latin;" instead of "Any-Latin;" so that only Hangul and Hiragana characters get transliterated. Other non-ASCII characters (e.g. Kanji and Chinese characters) are left untouched.
Some side notes (some of them have nothing to do with the changes made in this PR, though):
ModifySortChunks before the ICU transliterator gets used. Keep that in mind when choosing the right transliterator rule (Transliterator ID).
Issues