forked from phacility/phabricator
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathinternationalization.diviner
381 lines (278 loc) · 13 KB
/
internationalization.diviner
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
@title Internationalization
@group developer
Describes Phabricator translation and localization.
Overview
========
Phabricator partially supports internationalization, but many of the tools
are missing or in a prototype state.
This document describes what tools exist today, how to add new translations,
and how to use the translation tools to make a codebase translatable.
Adding a New Locale
===================
To add a new locale, subclass @{class:PhutilLocale}. This allows you to
introduce a new locale, like "German" or "Klingon".
Once you've created a locale, applications can add translations for that
locale.
For instructions on adding new classes, see
@{article@phabcontrib:Adding New Classes}.
Adding Translations to Locale
=============================
To translate strings, subclass @{class:PhutilTranslation}. Translations need
to belong to a locale: the locale defines an available language, and each
translation subclass provides strings for it.
Translations are separated from locales so that third-party applications can
provide translations into different locales without needing to define those
locales themselves.
For instructions on adding new classes, see
@{article@phabcontrib:Adding New Classes}.
Writing Translatable Code
=========================
Strings are marked for translation with @{function@libphutil:pht}.
The `pht()` function takes a string (and possibly some parameters) and returns
the translated version of that string in the current viewer's locale, if a
translation is available.
If text strings will ultimately be read by humans, they should essentially
always be wrapped in `pht()`. For example:
```lang=php
$dialog->appendParagraph(pht('This is an example.'));
```
This allows the code to return the correct Spanish or German or Russian
version of the text, if the viewer is using Phabricator in one of those
languages and a translation is available.
Using `pht()` properly so that strings are translatable can be tricky. Briefly,
the major rules are:
- Only pass static strings as the first parameter to `pht()`.
- Use parameters to create strings containing user names, object names, etc.
- Translate full sentences, not sentence fragments.
- Let the translation framework handle plural rules.
- Use @{class@libphutil:PhutilNumber} for numbers.
- Let the translation framework handle subject gender rules.
- Translate all human-readable text, even exceptions and error messages.
See the next few sections for details on these rules.
Use Static Strings
==================
The first parameter to `pht()` must always be a static string. Broadly, this
means it should not contain variables or function or method calls (it's OK to
split it across multiple lines and concatenate the parts together).
These are good:
```lang=php
pht('The night is dark.');
pht(
'Two roads diverged in a yellow wood, '.
'and sorry I could not travel both '.
'and be one traveler, long I stood.');
```
These won't work (they might appear to work, but are wrong):
```lang=php, counterexample
pht(some_function());
pht('The duck says, '.$quack);
pht($string);
```
The first argument must be a static string so it can be extracted by static
analysis tools and dumped in a big file for translators. If it contains
functions or variables, it can't be extracted, so translators won't be able to
translate it.
Lint will warn you about problems with use of static strings in calls to
`pht()`.
Parameters
==========
You can provide parameters to a translation string by using `sprintf()`-style
patterns in the input string. For example:
```lang=php
pht('%s earned an award.', $actor);
pht('%s closed %s.', $actor, $task);
```
This is primarily appropriate for usernames, object names, counts, and
untranslatable strings like URIs or instructions to run commands from the CLI.
Parameters normally should not be used to combine two pieces of translated
text: see the next section for guidance.
Sentence Fragments
==================
You should almost always pass the largest block of text to `pht()` that you
can. Particularly, it's important to pass complete sentences, not try to build
a translation by stringing together sentence fragments.
There are several reasons for this:
- It gives translators more context, so they can be more confident they are
producing a satisfying, natural-sounding translation which will make sense
and sound good to native speakers.
- In some languages, one fragment may need to translate differently depending
on what the other fragment says.
- In some languages, the most natural-sounding translation may change the
order of words in the sentence.
For example, suppose we want to translate these sentence to give the user some
instructions about how to use an interface:
> Turn the switch to the right.
> Turn the switch to the left.
> Turn the dial to the right.
> Turn the dial to the left.
Maybe we have a function like this:
```
function get_string($is_switch, $is_right) {
// ...
}
```
One way to write the function body would be like this:
```lang=php, counterexample
$what = $is_switch ? pht('switch') : pht('dial');
$dir = $is_right ? pht('right') : pht('left');
return pht('Turn the ').$what.pht(' to the ').$dir.pht('.');
```
This will work fine in English, but won't work well in other languages.
One problem with doing this is handling gendered nouns. Languages like Spanish
have gendered nouns, where some nouns are "masculine" and others are
"feminine". The gender of a noun affects which article (in English, the word
"the" is an article) should be used with it.
In English, we say "**the** knob" and "**the** switch", but a Spanish speaker
would say "**la** perilla" and "**el** interruptor", because the noun for
"knob" in Spanish is feminine (so it is used with the article "la") while the
noun for "switch" is masculine (so it is used with the article "el").
A Spanish speaker can not translate the string "Turn the" correctly without
knowing which gender the noun has. Spanish has //two// translations for this
string ("Gira el", "Gira la"), and the form depends on which noun is being
used.
Another problem is that this reduces flexibility. Translating fragments like
this locks translators into a specific word order, when rearranging the words
might make the sentence sound much more natural to a native speaker.
For example, if the string read "The knob, to the right, turn it.", it
would technically be English and most English readers would understand the
meaning, but no native English speaker would speak or write like this.
However, some languages have different subject-verb order rules or
colloquisalisms, and a word order which transliterates like this may sound more
natural to a native speaker. By translating fragments instead of complete
sentences, you lock translators into English word order.
Finally, the last fragment is just a period. If a translator is presented with
this string in an interface without much context, they have no hope of guessing
how it is used in the software (it could be an end-of-sentence marker, or a
decimal point, or a date separator, or a currency separator, all of which have
very different translations in many locales). It will also conflict with all
other translations of the same string in the codebase, so even if they are
given context they can't translate it without technical problems.
To avoid these issues, provide complete sentences for translation. This almost
always takes the form of writing out alternatives in full. This is a good way
to implement the example function:
```lang=php
if ($is_switch) {
if ($is_right) {
return pht('Turn the switch to the right.');
} else {
return pht('Turn the switch to the left.');
}
} else {
if ($is_right) {
return pht('Turn the dial to the right.');
} else {
return pht('Turn the dial to the left.');
}
}
```
Although this is more verbose, translators can now get genders correct,
rearrange word order, and have far more context when translating. This enables
better, natural-sounding translations which are more satisfying to native
speakers.
Singular and Plural
===================
Different languages have various rules for plural nouns.
In English there are usually two plural noun forms: for one thing, and any
other number of things. For example, we say that one chair is a "chair" and any
other number of chairs are "chairs": "0 chairs", "1 chair", "2 chairs", etc.
In other languages, there are different (and, in some cases, more) plural
forms. For example, in Czech, there are separate forms for "one", "several",
and "many".
Because plural noun rules depend on the language, you should not write code
which hard-codes English rules. For example, this won't translate well:
```lang=php, counterexample
if ($count == 1) {
return pht('This will take an hour.');
} else {
return pht('This will take hours.');
}
```
This code is hard-coding the English rule for plural nouns. In languages like
Czech, the correct word for "hours" may be different if the count is 2 or 15,
but a translator won't be able to provide the correct translation if the string
is written like this.
Instead, pass a generic string to the translation engine which //includes// the
number of objects, and let it handle plural nouns. This is the correct way to
write the translation:
```lang=php
return pht('This will take %s hour(s).', new PhutilNumber($count));
```
If you now load the web UI, you'll see "hour(s)" literally in the UI. To fix
this so the translation sounds better in English, provide translations for this
string in the @{class@phabricator:PhabricatorUSEnglishTranslation} file:
```lang=php
'This will take %s hour(s).' => array(
'This will take an hour.',
'This will take hours.',
),
```
The string will then sound natural in English, but non-English translators will
also be able to produce a natural translation.
Note that the translations don't actually include the number in this case. The
number is being passed from the code, but that just lets the translation engine
get the rules right: the number does not need to appear in the final
translations shown to the user.
Using PhutilNumber
==================
When translating numbers, you should almost always use `%s` and wrap the count
or number in `new PhutilNumber($count)`. For example:
```lang=php
pht('You have %s experience point(s).', new PhutilNumber($xp));
```
This will let the translation engine handle plural noun rules correctly, and
also format large numbers correctly in a locale-aware way with proper unit and
decimal separators (for example, `1000000` may be printed as "1,000,000",
with commas for readability).
The exception to this rule is IDs which should not be written with unit
separators. For example, this is correct for an object ID:
```lang=php
pht('This diff has ID %d.', $diff->getID());
```
Male and Female
===============
Different languages also use different words for talking about subjects who are
male, female or have an unknown gender. In English this is mostly just
pronouns (like "he" and "she") but there are more complex rules in other
languages, and languages like Czech also require verb agreement.
When a parameter refers to a gendered person, pass an object which implements
@{interface@libphutil:PhutilPerson} to `pht()` so translators can provide
gendered translation variants.
```lang=php
pht('%s wrote', $actor);
```
Translators will create these translations:
```lang=php
// English translation
'%s wrote';
// Czech translation
array('%s napsal', '%s napsala');
```
(You usually don't need to worry very much about this rule, it is difficult to
get wrong in standard code.)
Exceptions and Errors
=====================
You should translate all human-readable text, even exceptions and error
messages. This is primarily a rule of convenience which is straightforward
and easy to follow, not a technical rule.
Some exceptions and error messages don't //technically// need to be translated,
as they will never be shown to a user, but many exceptions and error messages
are (or will become) user-facing on some way. When writing a message, there is
often no clear and objective way to determine which type of message you are
writing. Rather than try to distinguish which are which, we simply translate
all human-readable text. This rule is unambiguous and easy to follow.
In cases where similar error or exception text is often repeated, it is
probably appropriate to define an exception for that category of error rather
than write the text out repeatedly, anyway. Two examples are
@{class@libphutil:PhutilInvalidStateException} and
@{class@libphutil:PhutilMethodNotImplementedException}, which mostly exist to
produce a consistent message about a common error state in a convenient way.
There are a handful of error strings in the codebase which may be used before
the translation framework is loaded, or may be used during handling other
errors, possibly raised from within the translation framework. This handful
of special cases are left untranslated to prevent fatals and cycles in the
error handler.
Next Steps
==========
Continue by:
- adding a new locale or translation file with
@{article@phabcontrib:Adding New Classes}.