-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for internationalization #825
Conversation
|
||
export interface TranslatableString { | ||
key : string; | ||
context ?: string; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of context?
Is it to distinguish between similar inputs that need to be translated differently depending on the function/ domain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. gettext either takes a single string, or a pair of string and context. documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ik I'm trying to understand where we need context. Do you have an example in "it.po" where msgid is the same but msgstr is different?
} | ||
} | ||
|
||
function makeChoice(choices : Replaceable[]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this to handle multiple canonical annotations?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes we fold everything into a single string, so translators can add/remove variants.
msgid "{them [plural=other]|it [plural=one]}" | ||
msgstr "" | ||
"{loro [plural=other]|lui [plural=one,gender=m]|lei [plural=one,gender=f]}" | ||
|
||
#: lib/utils/thingtalk/describe.ts:258 lib/utils/thingtalk/describe.ts:517 | ||
#: lib/utils/thingtalk/describe.ts:257 lib/utils/thingtalk/describe.ts:516 | ||
msgid "the ${name} [plural=name[plural]]" | ||
msgstr "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite understand what's happening here. Are you adding all possible variations with different gender/ plural forms? Are these two the only attributes that changes the output? How about weird casing of words in German that changes depending on their position in the sentence?
Also, a general question: does PO allows multiple translations for a single input?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you're supposed to add all the variations. In Italian gender/number are the only variations for nouns. In German, IIRC, case only matters for articles, but yeah you would have that too. German will need to disable lower casing because nouns are capitalized, but other than that it should work the same way.
PO as a file format specifies one translation per translatable string. In the way we use PO though, the translation has a specific format (genie template strings) and we can have multiple variants.
Translating canonical forms one by one doesn't make sense. Instead, collapse all canonical forms for a given parameter/function into a single translatable string, so translators can add/remove phrases as they need.
Factor out the code to walk annotations into a common module, and use (p)gettext consistently to merge the annotations. This ensures that the translations are actually compatible with the tools that process PO files.
Using the "enum_value" or "value_enum" keys of the #_[canonical] annotation of a parameter that uses the type. Because enums are used interchangeably on all parameters with the same type, the same canonical must be used on all parameters, and it is unspecified which one will be used otherwise.
Default them, if necessary
"filter" is a keyword in ThingTalk so we can't actually write `#_[canonical={ filter= ... }]`
Normalize locale from gettext (POSIX) to BGP 47 format
So we can use it from thingpedia-common-devices
ff19208
to
22a72ec
Compare
Btw, one problem with this PR that I noticed from the automatic translation tests is that the automatic translation tries to translate the flags as well. Can we handle that in a smarter way somehow? @Mehrad0711 |
What do you mean by flags? |
The part in the square brackets sets different flags on each phrase, for pos, plural, gender, etc. |
But we're already doing that: genie-toolkit/scripts/po_edit.py Line 52 in b68f2e2
In previous po files I didn't see "gender". There were a few "pos" but those are the lines we don't translate. If your changes are adding lines with new constructs or flags we should just include it in that regex. |
pos is definitely one, and it's definitely going in a lot of strings that we do need to translate. But we should make this regex generic to any flag. |
Yeah feel free to fix it and check the rest. There might be other constructs I'm unaware of that we miss there. |
Update expected manifest, fix translating parts that should not be translated
This is on top of #823
It changes how we extract translations from manifests and how we merge them back. The changes are necessary to actually be able to translate correctly and with proper grammar.
It also implements canonical forms of enum values, which was a missing piece for translation.