Revisiting transliterations #2178
Translation has always been a hot topic! I think we can assume that handles are:
Which makes the problem easier to solve. But I would agree that Symphony could provide extensibility on this particular thing, since a developer may want to change it. For that, we would need to figure out a correct way to "distribute" the work. Right now, I do not know of any delegate that needs to return a value when it gets triggered, and this would look weird. Also, should a handle change its value if the current language of the author who hits the save button is not the same as the last author's? I would not want my URLs to change unless the actual value of the field changes.
Not sure if we're on the same side here. Handles are used in the frontend, so they shouldn't reflect or depend on any author's backend language at all, imo.
Yes that's totally right. I was referring to your comment:
Handles must not care what the current backend language is. They must be deterministic. So extensibility is hard, since it must preserve this deterministic nature.
Just to reply on this. Yes. At the moment, if you have one Author in Chinese who creates an entry, the handle will be created with Chinese transliterations. If a German Author then edits the same entry, the handle will use German transliterations. This may result in some differences and the handle changing. This also occurs for all resources in the backend: datasources, events, field labels etc. So how can we solve this? I have no idea, to be honest. We can't prevent Authors of different languages editing each other's content. I think it would be strange to always just enforce English transliterations. Can we get by without using them at all for handles? What would be the impact? Is this actually a problem that needs solving? Has anyone actually experienced an issue with two different languages resulting in the handle changing and breaking the site?
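To make the effect concrete, here is a minimal sketch in Python (Symphony itself is PHP, so this is only an illustration). The tables and the `create_handle` function are hypothetical and do not reflect Symphony's actual transliteration data or API; they only show how the same title can yield two handles when the table is chosen by the author's backend language.

```python
import re

# Hypothetical per-language transliteration tables (assumption, not Symphony's data).
TRANSLITERATIONS = {
    "de": {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"},
    "en": {"ä": "a", "ö": "o", "ü": "u", "ß": "ss"},
}


def create_handle(text: str, language: str) -> str:
    """Build a handle using the transliteration table of the given language."""
    table = TRANSLITERATIONS[language]
    lowered = text.lower()
    # Replace each character that has an entry in the table, keep the rest.
    replaced = "".join(table.get(ch, ch) for ch in lowered)
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", replaced).strip("-")


title = "Schöne Grüße"
print(create_handle(title, "de"))  # schoene-gruesse
print(create_handle(title, "en"))  # schone-grusse
```

The entry content is identical in both calls; only the language argument differs, which is exactly the non-deterministic behaviour described above.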
The system language (not the author's language) should be used as standard.
Yes. We normally use an English backend for development (personal habit) but our clients use the German locale. We've often run into the problem of differing handles.
That's one problem, but not the only one. One author editing content for multiple languages (multilingual website) is a problem as well. Just because an author uses the backend in one language, we can't assume that the content is in the same language, or that all content is only in one language.
I think so.
As mentioned above, this is only one of the issues. The more common problem is with multilingual sites. http://www.getsymphony.com/discuss/thread/107626/
Indeed, this wouldn't solve the problem at all.
I think it's ok for URLs (and even domains) to have unicode characters in them, so we should only strip out (instead of replacing with language specific terms) characters that are not allowed in URLs (like spaces).
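A rough Python sketch of this "strip only, don't replace" idea, assuming a hypothetical `create_unicode_handle` function (nothing here is Symphony code). It keeps Unicode letters and digits, replaces everything else with hyphens, and shows what the handle looks like once percent-encoded for the wire.

```python
import re
import unicodedata
from urllib.parse import quote


def create_unicode_handle(text: str) -> str:
    """Keep Unicode letters and digits; turn everything else into hyphens."""
    # Normalise to NFC so the same visible string always yields the same handle.
    normalized = unicodedata.normalize("NFC", text).lower()
    # In Python 3, \w matches Unicode letters, digits and the underscore.
    return re.sub(r"[^\w]+", "-", normalized).strip("-")


handle = create_unicode_handle("Blödsinn & Straße?")
print(handle)         # blödsinn-straße
print(quote(handle))  # bl%C3%B6dsinn-stra%C3%9Fe
```

The result no longer depends on any language setting, but the handle still contains characters that some visitors cannot easily type.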
Again, what if the frontend is multilingual?
Ah, right, sorry. A two-step idea:
It's not necessary to store multiple handles and switch based on frontend context. A field only holds content in one language (right?), so the handle for a field only needs to be in one language as well. Unless you're planning to make all core fields multilingual and store content for each field in different languages (like the Multilingual xxx Field extensions).
Could work, but you'd need to set the language for every field in the section editor. To be clear, your proposed solution would work as well, and setting up sections could remain as it is now, since handles for different languages would be created in the background automatically. But I think it's a somewhat "dirty" solution, since you're cluttering up the database with lots of unnecessary handles when only one handle is needed. Also, you'd have to take care of datasource filtering etc. Wouldn't it be much simpler to not replace characters with language-specific terms at all?
Can you elaborate, please?
What about adding a language selector to all core fields that create handles?
PS: There could also be an additional setting "Handle localisation" in the system preferences right under the language selector providing three options:
(The latter providing the aforementioned language selector in the field settings.)
Sounds complicated and possibly introduces more new issues than it resolves...
Simple. If we don't use transliterations and don't replace certain characters like HTML5 supports IRIs and IRIs are capable of handling unicode characters, so do we actually still need transliterations? Just strip out reserved characters and everything's fine. Only thing to keep in mind is to
So you'd like to keep umlauts in the URL?
I think browser support is not an issue. The actual problem is an audience without these characters on the keyboard. If I think of German umlauts, they'd be an accessibility problem outside German-speaking countries.
Mhh. Good point. Would it be possible to have generic (not language-specific) transliterations? Also, are we talking about transliteration or transcription here? Since transliterations still have special characters, I think it's actually the latter and we're not using the term correctly. Edit: On the other hand, does every country have ASCII characters on their keyboards? Update: Apparently they do.
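One possible reading of "generic" is sketched below in Python: decompose each character with Unicode normalisation and drop the combining marks, so no per-language table is consulted. The `generic_handle` function and the small `FALLBACKS` table are assumptions made up for this example; the table is incomplete and only covers a few letters that have no Unicode decomposition.

```python
import re
import unicodedata

# Letters without a Unicode decomposition need an explicit fallback.
# Hypothetical and incomplete table, for illustration only.
FALLBACKS = {"ß": "ss", "æ": "ae", "ø": "o", "đ": "d", "ł": "l"}


def generic_handle(text: str) -> str:
    """Language-independent handle: strip diacritics instead of spelling them out."""
    lowered = text.lower()
    replaced = "".join(FALLBACKS.get(ch, ch) for ch in lowered)
    # NFKD splits "ö" into "o" plus a combining diaeresis, "ç" into "c" plus a cedilla.
    decomposed = unicodedata.normalize("NFKD", replaced)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", "-", stripped).strip("-")


print(generic_handle("Blödsinn"))       # blodsinn
print(generic_handle("Façade Straße"))  # facade-strasse
```

This is deterministic regardless of who edits the entry, but it gives "blodsinn" rather than the "bloedsinn" a German reader would expect, and text in non-Latin scripts would be reduced to an empty handle.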
Thinking more about it, what about internationalized domain names? Same accessibility issues, I guess? And if this wasn't a concern for domain names (although I'm curious why not), is it really on us to worry about it for URL handles?
Sure. How should an Englishman ever type "blödsinn.de" on his keyboard?
Oh, I would love to tell him that on the phone, proudly presenting my new website. :-)
As I said, I'm wondering why this wasn't a concern for domain names. I mean, you can always argue to only use a German domain for a German audience. Same for handles. If a URL has German umlauts in it, chances are that it's not the URL you're looking for in the first place if you don't have German umlauts on your keyboard. But what about the edge cases? What about people living abroad, for example?
There is a good reason why international domain names are so successful ;) I don't know of any good example of a German organisation using umlauts in their domain other than to redirect to the transliterated equivalent. While umlauts work fine inside German-speaking countries, you kind of exclude people from other countries (Michael's simple example proves that), and the internet is global, not local. What would you do if you needed a cedilla for a French domain or link (which is what we are talking about here)? Difficult on a non-French keyboard. Transliterations are a common solution to this problem, in written form generally, not exclusively on the web.
PS: By the way, even if a user was able to type umlauts, he would have to know them to correctly memorise a link. Think of the German eszett: people from other countries tend to think of it as "a strange b", which is why letters from abroad are often addressed to "Beispielstrabe" instead of "Beispielstraße". So I really think umlauts in domain names or links are a bad idea.
Agreed, so let's get back to solving the problem. Keep in mind, I'm not trying to defend a particular idea, only brainstorming here. Three possible solutions, imo.
Don't create language-specific handles
Would it be possible to have generic (not language-specific) transliterations? Also, are we talking about transliteration or transcription here? Since transliterations still have special characters, I think it's actually the latter and we're not using the term correctly.

Don't automatically create handles at all
Developers could add a dedicated handle field to a section where handles are needed. The handle field would be a core input field. The core input field would get reflection and text formatter capabilities, so the handle field could grab content from another field, and transliterations (even language-specific ones) could be added as text formatters. If an author is unhappy with the result, the handle could be edited. Also, reflection and text formatter capabilities for the core input field would benefit other use cases as well. The only thing I don't know how to handle yet is how to create handles for structural settings (section names, field names, data sources, events). I think always using English transliterations in the backend would be fine, though.
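A rough sketch of how such a dedicated handle field might behave, written in Python for brevity. Everything here is hypothetical (`HandleField`, `german_transliteration`, `slugify`); it is not an existing Symphony field or API. The field reflects the value of another field, runs it through a developer-chosen list of formatters, and lets an author override the result.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Formatter = Callable[[str], str]


def german_transliteration(text: str) -> str:
    """Example language-specific formatter (hypothetical table)."""
    table = {"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"}
    return "".join(table.get(ch, ch) for ch in text)


def slugify(text: str) -> str:
    """Replace runs of anything outside a-z and 0-9 with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


@dataclass
class HandleField:
    source: str                                    # name of the field to reflect
    formatters: List[Formatter] = field(default_factory=list)
    override: Optional[str] = None                 # manual edit by an author

    def value(self, entry: Dict[str, str]) -> str:
        # A manually edited handle always wins over the generated one.
        if self.override is not None:
            return self.override
        result = entry[self.source]
        for formatter in self.formatters:
            result = formatter(result)
        return result


entry = {"title": "Schöne Grüße"}
handle = HandleField("title", [str.lower, german_transliteration, slugify])
print(handle.value(entry))  # schoene-gruesse

handle.override = "gruesse"
print(handle.value(entry))  # gruesse
```

Because the formatter list is part of the section setup rather than the author's session, the generated handle would not depend on who saves the entry.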
I am all for
+1 :)
I wanted to bring this topic up a while ago but forgot about it.
The core function for creating handles uses transliterations. Transliterations can be overwritten per language by extensions.
If I'm not mistaken, which transliterations are used to create a handle always depends on the language your backend currently uses.
Is this correct?
I think that's a problem for multilingual websites and in cases where the backend language doesn't match the frontend language.
Symphony usually doesn't assume anything regarding your frontend, so this seems a bit weird.
Any ideas?