x/text: localization support #12750
Here is one solution: https://github.com/maximilien/i18n4go |
@maximilien: i18n4go does not address localized formatting of values like numbers and I think it will be difficult to retrofit it properly. In case of selecting translation variants based on the linguistic features of the arguments, you'll end up with the same struggle one witnesses with localization frameworks for other languages. This proposal is fairly agnostic about translation pipelines, though. So it may be possible to fit this proposal on top of the i18n4go translation pipeline. Seems like a convenient first target. |
Is the format string by itself sufficient for determining the context? I can imagine a very simple Printf used like I must admit I'm not very familiar with localization problems and this may not be an issue in practice. |
@infogulch It is indeed not enough. In my provisionally worked out API I do define a Key function that can be used for things like adding meaning and alternatives. I left it out of the design doc to not go into details too much. (I also stripped about 1/3rd of my original draft; maybe I went a bit overboard.) Note that as the string has no meaning in itself, you could always write the format string as, for example, Printf would have the following signature:
where Reference is either a string or a result from a func like
This allows the familiar But the example string you provide does raise another good point: there may be format strings one does not want to translate at all while still using the message package to substitute localized values. This is possible as is (e.g. |
As far as plurals are concerned, I've seen some elaborate examples, but from skimming the doc, it seems they can use only the "<", ">", and "=" operators. I didn't read it 100% thoroughly, however, so I may be wrong. So let me ask here for clarification: are the proposed mechanisms enough to cater for the rule for, e.g., the Polish language? In a version I found on the Weblate site, it's described as [1] [2]:
which seems to me quite fine, correctly giving e.g.:
|
@akavel: one should distinguish selectors from the rules you mention. The rules you refer to (which are defined in CLDR) would be used by the plural package to map numbers to a small set of plural categories (in the case of your example: one, few and many). The selectors subsequently pick alternatives based on these simplified categories. The maximum number of such categories, IIRC, is 6 (e.g. for Arabic). In my foreseen implementation, it is really up to the feature implementation to interpret selectors. This means that there is a lot of flexibility in supporting wild feature value matching. However, if one looks at linguistic grammars like LFG and HPSG, which use many more features, the set of possible feature values is usually small. The doc is indeed a bit sparse here (as well as on all other topics, really). |
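To make the rule/selector split concrete, here is a minimal stdlib-only sketch. It is not the x/text plural package: `polishCategory`, `fileMessages`, and `FilesMessage` are hypothetical names, the rule covers only non-negative integers, and the category names follow CLDR ("one", "few", "many").

```go
package main

import "fmt"

// Category is a simplified plural category, named after the CLDR categories.
type Category string

const (
	One  Category = "one"
	Few  Category = "few"
	Many Category = "many"
)

// polishCategory is the "rule" step: it maps a non-negative integer to a
// plural category using the integer part of the CLDR rule for Polish.
func polishCategory(n int) Category {
	mod10, mod100 := n%10, n%100
	switch {
	case n == 1:
		return One
	case mod10 >= 2 && mod10 <= 4 && (mod100 < 12 || mod100 > 14):
		return Few
	default:
		return Many
	}
}

// fileMessages is the "selector" step: one translation variant per
// category. It only ever sees the category, never the raw number.
var fileMessages = map[Category]string{
	One:  "%d plik",
	Few:  "%d pliki",
	Many: "%d plików",
}

// FilesMessage (hypothetical helper) combines both steps.
func FilesMessage(n int) string {
	return fmt.Sprintf(fileMessages[polishCategory(n)], n)
}

func main() {
	for _, n := range []int{0, 1, 2, 5, 12, 22, 25, 112} {
		fmt.Println(FilesMessage(n))
	}
}
```

The translator only has to supply three variants; all the modulo arithmetic stays inside the rule function.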
@mpvl, sounds good. Happy to try and integrate once you have something ready to try. Best. |
Have you guys seen this one? https://github.com/nicksnyder/go-i18n seems pretty solid at first sight. |
It uses JSON as its base format, has tooling to help with the translation workflows |
By the way, I submitted some formatting fixes for the proposal doc a few weeks ago. https://go-review.googlesource.com/19753 Not sure what I was supposed to do to get it reviewed. |
Are there any updates on how far the proposal is implemented in x/text/language? I find it a bit hard to figure out whether this is anywhere near production readiness. |
x/text/language is definitely production ready. But if you mean the specific functionality of this issue, it is still under development. Lately the focus has been more on other parts; my intention for the upcoming months is to focus specifically on segmentation and this. That said, string substitution is available with limited functionality, so you could play around with it. I recently checked in a tool to extract strings. |
Thanks for your reply. So far I have only used x/text/language in production and coded something around it that translates and formats messages for different countries. I just wanted to check whether the language API is still up for changes. |
No plans to change it. It works well enough that breaking existing users would not be worthwhile. |
Hi , |
Hi @mpvl, others,
If I use ICU to sort these strings (using level 3 strength), then I get the strings back like this:
Am I setting up the collator incorrectly? |
Hello @MickMonaghan, it looks like there is not much interest in this discussion, so I'll just add my findings so far. I looked into the collate code and could not really figure out how the sort order is determined. I asked a Japanese friend of mine how he would sort a list of German and Japanese cities. He either converts the Japanese into the Latin alphabet or the Latin into Japanese script and then sorts the list. I think that is also a good way to sort this list: first transliterate everything into one script and then sort accordingly. |
Hey @morriswinkler-simplesurance - thanks for the response.
With a Korean sort, the Latin characters should be sorted after the Korean characters. But that's not happening. |
@MickMonaghan: the implementation is based on the CLDR UCA tables. If I look at the collation elements of both the DUCET (Unicode's tables) and CLDR (the tailorings), they both show Hangul to have a higher primary collation value than Latin. So that explains why Korean is sorted later. What is probably happening in ICU is that the script for the selected language is sorted before other scripts. The Go implementation currently does not support script reordering, though. This is a TODO, but it depends on changing the implementation to use fractional weights. This is a huge change and may take a while. |
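As a rough illustration of what script reordering means, the following stdlib-only sketch puts Hangul-initial strings before Latin-initial ones. It is an assumption-laden stand-in, not how ICU or x/text/collate work: `scriptRank` and `SortHangulFirst` are hypothetical names, the sample strings are made up, and ties are broken by plain code point order rather than real collation weights.

```go
package main

import (
	"fmt"
	"sort"
	"unicode"
)

// scriptRank returns a sort priority based on the script of the first rune
// of s. Lower ranks sort first. The order is an assumption for a Korean UI.
func scriptRank(s string) int {
	for _, r := range s {
		switch {
		case unicode.Is(unicode.Hangul, r):
			return 0
		case unicode.Is(unicode.Latin, r):
			return 1
		default:
			return 2
		}
	}
	return 2 // empty string: treated like "other"
}

// SortHangulFirst sorts items in place: Hangul first, then Latin, then
// everything else. Within one script it falls back to code point order,
// which is NOT a real collation order.
func SortHangulFirst(items []string) {
	sort.SliceStable(items, func(i, j int) bool {
		ri, rj := scriptRank(items[i]), scriptRank(items[j])
		if ri != rj {
			return ri < rj
		}
		return items[i] < items[j]
	})
}

func main() {
	items := []string{"Berlin", "서울", "Aachen", "부산"}
	SortHangulFirst(items)
	fmt.Println(items)
}
```

A real implementation would reorder primary weights inside the collation tables instead of ranking by first rune, which is why the change mentioned above is so large.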
@MickMonaghan: I suggest you file a separate issue for this so it can be tracked individually. |
@MickMonaghan: dates/times is on the list, but only after number etc. is completed. |
Thanks @mpvl , I'll log the collation bug |
I started trying out golang seriously today to create a small application just for fun. However, when I tried to localize my little application, I couldn't figure out any good solution. I just got a big headache. This is what I would normally do in TypeScript:
export const Exceptions = {
  "AuthenticationError": {
    "Invalid": {
      "en-GB": "Invalid username or password",
      "sv-SE": "Fel användarnamn eller lösenord"
    },
    "Required": {
      "en-GB": "You must be authenticated to see this resource",
      "sv-SE": "Du måste vara inloggad för att se denna resurs"
    }
  }
}
export class AuthenticationError extends Error {
  constructor(language: "en-GB" | "sv-SE", message: "Invalid" | "Required") {
    super(Exceptions.AuthenticationError[message][language]);
  }
}
I would get errors if I typed any string wrong and it would simply just work. I tried to do something similar in Go, but the pain just got unbearable:
package localization

// labels holds one message in every supported language.
type labels struct {
	enGB string
	svSE string
}

type authenticationErrorMessages struct {
	Invalid  labels
	Required labels
}

type exceptionMessages struct {
	authErrors authenticationErrorMessages
}

// ExceptionMessage damnit, need to write a comment in an odd way.
func ExceptionMessage(language string, category string, exceptionType string, params []string) string {
	var exceptionMsg = exceptionMessages{
		authErrors: authenticationErrorMessages{
			Invalid: labels{
				enGB: "Invalid username or password",
				svSE: "Fel användarnamn eller lösenord",
			},
			Required: labels{
				enGB: "You must be authenticated to see this resource",
				svSE: "Du måste vara inloggad för att se denna resurs",
			},
		},
	}
	switch category {
	case "AuthenticationError":
		// Select on the exception type within the category.
		switch exceptionType {
		case "Invalid":
			switch language {
			case "enGB":
				return exceptionMsg.authErrors.Invalid.enGB
			case "svSE":
				return exceptionMsg.authErrors.Invalid.svSE
			}
		case "Required":
			switch language {
			case "enGB":
				return exceptionMsg.authErrors.Required.enGB
			case "svSE":
				return exceptionMsg.authErrors.Required.svSE
			}
		}
	}
	return "Error message not found"
}

// AuthenticationError damnit, need to write a comment in an odd way.
func AuthenticationError(message string) string {
	// The category and language must match the case labels above.
	return ExceptionMessage("enGB", "AuthenticationError", message, nil)
}
TL;DR
So far everything has been really smooth writing golang code but this is just painful. I've tried out some localization packages as well but that hasn't worked out well so far. I'm of course not an expert in go after less than a day, maybe I missed something obvious in the language specification when I went through it this morning but regardless I'd really like to see some progress on this issue. |
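For what it's worth, the nested switch can be replaced by a map lookup. This is only a sketch using the standard library: `Lang`, `Key`, `catalog`, and `Message` are hypothetical names, and the fallback to en-GB is an assumption rather than anything from the proposal. Note that Go will still accept an untyped string literal where a `Lang` or `Key` is expected, so the named types nudge callers towards the constants but do not enforce them the way the TypeScript union types do.

```go
package main

import "fmt"

// Lang and Key are named types so that call sites use the constants below
// rather than free-form strings.
type Lang string
type Key string

const (
	EnGB Lang = "en-GB"
	SvSE Lang = "sv-SE"

	AuthInvalid  Key = "AuthenticationError.Invalid"
	AuthRequired Key = "AuthenticationError.Required"
)

// catalog maps message key -> language -> text.
var catalog = map[Key]map[Lang]string{
	AuthInvalid: {
		EnGB: "Invalid username or password",
		SvSE: "Fel användarnamn eller lösenord",
	},
	AuthRequired: {
		EnGB: "You must be authenticated to see this resource",
		SvSE: "Du måste vara inloggad för att se denna resurs",
	},
}

// Message returns the text for key in lang. It falls back to en-GB when the
// language is missing and reports false when the key is unknown.
func Message(lang Lang, key Key) (string, bool) {
	texts, ok := catalog[key]
	if !ok {
		return "", false
	}
	if text, ok := texts[lang]; ok {
		return text, true
	}
	text, ok := texts[EnGB]
	return text, ok
}

func main() {
	msg, _ := Message(SvSE, AuthRequired)
	fmt.Println(msg)
}
```

Adding a language or a message then means adding a map entry, not another switch arm.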
Just wanted to know the status of the repo, especially that of the gotext tool. It seems that the gotext tool is currently broken as well, which prevents me from trying localization. |
The gotext tool is under active development and one of the main focuses at the moment. Progress comes in bursts, but it is definitely active. A documentation overhaul is part of that. |
@mpvl: I'm looking at this and deciding whether I want to use this or something else and manually format numbers/money/dates. My users are a bit peculiar. What they usually want is the same behavior as in the OS: the language is set to, e.g., English, but other formatting is based on the country. Or, even better, overridden via some settings page, also just like in the OS. |
x/text is very flexible with settings, although the use of it is somewhat
hidden.
Most settings are communicated through the language tags
("golang.org/x/text/language".Tag). Language tags implement BCP 47 tags,
augmented with CLDR -u extensions
<http://www.unicode.org/reports/tr35/#Locale_Extension_Key_and_Type_Data>.
For instance:
- zh-u-co-stroke: sort using stroke sorting, instead of pinyin
- en-u-nu-fullwide: using fullwidth Latin numbers
- en-GB-u-rg-uszzzz: British English with American region settings.
The x/text packages that accept language tags will extract the options
from these tags that are relevant to them. The language.Matcher preserves
these settings and also may add the -rg-xxxxxx tag when it notices the
dialect is different from the expressed region.
The reason for this indirect approach is that preferences, in practice, are
often expressed in language tags, for example through the "Accept-Language"
HTTP header. This approach allows these settings to pass through the
respective packages without the developer having to piece these out.
Anyway, if you rely on user preferences through language tags, you don't
have to worry about these. If you want to explicitly create user
preferences, you can use the language.Tag's SetTypeForKey to create new
language tags with updated preferences.
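To show what "extracting the options" from a tag looks like, here is a stdlib-only sketch that pulls the key/type pairs out of the -u- extension. The x/text packages do this themselves, so this is purely illustrative: `UnicodeExtension` is a hypothetical helper, it performs no validation, it ignores attributes that precede the first key, and it joins multi-part types with "-".

```go
package main

import (
	"fmt"
	"strings"
)

// UnicodeExtension extracts key/type pairs from the "-u-" extension of a
// BCP 47 tag, e.g. "en-GB-u-rg-uszzzz" yields {"rg": "uszzzz"}.
// Assumptions of this sketch: no validation, attributes before the first
// key are ignored, and a key without a type maps to the empty string.
func UnicodeExtension(tag string) map[string]string {
	out := map[string]string{}
	inExt := false
	key := ""
	for _, p := range strings.Split(strings.ToLower(tag), "-") {
		switch {
		case !inExt:
			// Skip language, script, and region subtags until "-u-".
			inExt = p == "u"
		case len(p) == 1:
			// Any other singleton (e.g. "t" or "x") ends the extension.
			return out
		case len(p) == 2:
			// Extension keys are exactly two characters long.
			key = p
			out[key] = ""
		case key != "":
			// Longer subtags are (parts of) the type for the current key.
			if out[key] != "" {
				out[key] += "-"
			}
			out[key] += p
		}
	}
	return out
}

func main() {
	for _, tag := range []string{"zh-u-co-stroke", "en-u-nu-fullwide", "en-GB-u-rg-uszzzz"} {
		fmt.Println(tag, UnicodeExtension(tag))
	}
}
```

In real code the language.Tag type carries these settings for you, and SetTypeForKey, mentioned above, is the way to build a tag with an overridden preference.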
|
Is there a recommended way to localize [text|html]/templates per the proposal? I like the idea proposed, but it doesn't seem to be implemented yet. Is that the case? |
Not yet. There is a design for it, but it requires added functionality of
the core template libraries.
|
Is there any ETA for this feature or a suggested work-around? |
Hi, just found this issue and cross-posting my recent proposal: #34989 Are compact number formats something which could potentially fall under the responsibilities of the x/text package, and if so, what would be the process for creating a contribution to add this functionality? |
Anything that is part of Unicode, including CLDR, fits in the x/text mandate. You could modify the existing package to include it. The same process as with Go applies. As that is CLDR 35, it would first require an upgrade of x/text to CLDR 35, which may take some effort. |
Great, thank you. I'll look into the difficulty of getting that upgraded. In the meantime I put together a library which serves my purpose well enough for now (for anyone who happens to stumble upon this): https://github.com/nkall/compactnumber |
Hi, since that message is from February 2017, I would like to know: is date/time localization getting closer to being implemented, or is it still far off on the Go roadmap? Thank you! |
Hi, the documentation of x/text is mentioning the gender feature in several places. Do I understand correctly, that this feature is currently not implemented? Thank you. |
@Xpert85 That is correct. |
I was hacking on my https://github.com/purpleidea/mgmt/ and it occurred to me that I'd like proper gettext support! Sadly, you can't have an underscore function:
But you can use two underscores! Sadly, the usefulness of this is not great, because if you stick that in a
it doesn't work because the function is seen as private, not public. My proposal: I'd like golang to consider treating the single underscore as a valid, public function. If that's too hard to do in the compiler, then to treat two underscores as a public function. This would go a long way into improving the readability of gettext translations in code =D Thanks! |
@purpleidea please file a separate proposal for that |
Just use T instead of _ as a function name. |
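For example, something along these lines reads almost as compactly as gettext's underscore. This is a minimal sketch with a hypothetical in-memory catalog; `translations` and this particular `T` are made up for illustration and are not part of x/text.

```go
package main

import "fmt"

// translations is a hypothetical in-memory catalog keyed by the source
// string, gettext-style.
var translations = map[string]string{
	"Hello, %s!": "Hallo, %s!",
}

// T looks up msg in the catalog and formats it with args. Unknown messages
// fall back to the source string.
func T(msg string, args ...interface{}) string {
	if t, ok := translations[msg]; ok {
		msg = t
	}
	return fmt.Sprintf(msg, args...)
}

func main() {
	fmt.Println(T("Hello, %s!", "Gopher"))
	fmt.Println(T("Goodbye"))
}
```

Because T starts with an upper-case letter, it is exported and can live in its own package.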
Do we have localisation support for dates now? If not, can anyone suggest an alternative that provides localisation support for dates? |
It would be nice to have date localization in x/text. In the meantime, between x/text and github.com/klauspost/lctime everything might be covered. I'm not sure why lctime is archived though.. perhaps that means it won't receive updates to the language rules. cc @klauspost |
@robfig It is a fork of a now-deleted package by @variadico. I mainly added a non-stateful interface so it could be used without affecting global package state. Since I don't really intend to maintain or bugfix it and the user count was low, I thought it would be fairest to archive it. If someone wants to maintain a fork, I'd be happy to link to it. |
|
@wolfgangmeyers It solves the problem of representing and formatting currency amounts for the time being. I'd love to see x/text maintained again though. |
This issue is intended as an umbrella tracking issue for localization support.
Localization support includes:
Details to be covered in design docs.