Localization

Ken Manheimer edited this page Mar 19, 2015 · 29 revisions

SpiderOak Mobile Client App Localization Maintenance

Activity-oriented Instructions

We use html10n.js to implement our app's localization. The following sections are organized around specific maintenance tasks that involve the app's literal strings, and describe how to use our localization provisions for each task.

Editing app strings

In our app, localizing strings entails:

Editing app literals

Editing app literals - Examples

Adding or editing localized strings is slightly different in code versus in templates, but we use the same approach for both. These examples embody the essentials, for the impatient.

Localized string in code:

qq("Server changed to [[server]]",
   {"server": standardServer});

Spanish rendering:
Servidor cambiado a spideroak.com
  • Based on the Spanish translation locale file mapping the key "Server changed to [[server]]" to "Servidor cambiado a {{server}}"
  • and standardServer having the run-time value "spideroak.com" when the expression is evaluated

Localized string in templates:

<div class="title">
  {{= qq("About [[SpiderOak]]",
         {"SpiderOak": s("SpiderOak")})
  }}
</div>

Spanish rendering:
Acerca de SpiderOak Blue
  • We do explicit API qq() calls within doT.js's "evaluate this" {{= / }} brackets, rather than using the library package's tag-attributes method.
  • The custom-brand-resolved string, s("SpiderOak"), is included by being passed as the value of the {{SpiderOak}} substitution parameter. It resolves in the Blue build to "SpiderOak Blue".

    Editing app literals - Details

    If you're going to edit app literals, you need to be familiar with all this:

    • qq() is our alias for the html10n.get() API call.
    • We use this explicit qq() API call even in templates
      • to avoid the mess of interleaving doT.js and html10n.js template syntax.
    • qq()'s first argument is the string to be localized. It's a key which maps to the localized translation of the string.
    • qq()'s key string can include programmatically substituted terms, distinguished by surrounding [[...]] brackets.
      • Accepted characters: A-Z, a-z, _ underscore, - dash, and . dot.
        • Terms surrounded by brackets but including other characters are not recognized for substitution.
        • In particular, special terms cannot contain any whitespace.
      • The bracketed terms are satisfied from either:
        • a mapping passed as the second argument to qq, or by
        • recursive translation of the special term, itself.
      • Translators must not translate the terms within brackets.
        • But the translated strings must use {{...}} curly braces instead of square brackets.
        • The translated-string curly braces are where html10n.js actually does the substitutions of the special terms.
      • Hence:
        • In code and in the localization files, all the translation key strings use [[...]] square brackets to distinguish special terms
        • While the localized target strings, and only those, use {{...}} curly brace pairs.
      • The bracketed term substitutions serve a few distinct purposes:
        • To access program state, like server and statusCode, pass the value as the localization parameter value.
          • E.g.: qq("Service host changed to [[server]]", {"server": newServer})
        • To access brand-specific customizations, like getting the app's brand-specific name in place of "SpiderOak": pass the brand-substitution value, s(...), as the parameter value.
          • E.g.: qq("Learn more about [[SpiderOak]]&raquo;", {SpiderOak: s("SpiderOak")})
        • To clearly signal to translators the trade strings that should not be translated, like ShareRoom and Zero-Knowledge, bracket them.
          • English-to-English entries for these non-translated strings are collected in a special JSON localization file. This file is not passed to the translators, keeping these localizations English-only - see JSON source files.
          • (For very different languages, like ideograph-based ones, we could provide more natively recognizable, language-specific refinements.
            • E.g. so they translate to a phrase that includes both the English version and a language-specific equivalent - like, for Chinese, Zero-Knowledge (零知识).)

    The html10n JavaScript API instructions have further details on our translation machinery.
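To make the bracket conventions above concrete, here is a minimal sketch of the substitution behaviour this section describes. It is not the html10n.js implementation: makeQq, the translations object, and the "Learn about" entry are hypothetical stand-ins, and falling back to the key when no translation exists is an assumption made only for illustration.

```javascript
// Hypothetical stand-in for qq(); NOT the real html10n.get() implementation.
// `translations` maps key strings (with [[term]] markers) to target strings
// (with {{term}} markers), as in our locale JSON files.
function makeQq(translations) {
  function has(object, name) {
    return Object.prototype.hasOwnProperty.call(object, name);
  }

  function qq(key, params) {
    params = params || {};
    // Assumption: with no translation, fall back to the key itself,
    // converting its [[term]] markers to the {{term}} target form.
    var target = has(translations, key)
      ? translations[key]
      : key.replace(/\[\[([A-Za-z_.-]+)\]\]/g, "{{$1}}");
    // Accepted term characters: A-Z, a-z, "_", "-" and ".".
    return target.replace(/\{\{([A-Za-z_.-]+)\}\}/g, function (match, term) {
      if (has(params, term)) {
        return String(params[term]);  // value passed as qq's second argument
      }
      if (has(translations, term)) {
        return qq(term, params);      // recursive translation of the term
      }
      return match;                   // unresolved terms are left visible
    });
  }
  return qq;
}

var qq = makeQq({
  "Server changed to [[server]]": "Servidor cambiado a {{server}}",
  "Learn about [[Zero-Knowledge]]": "Aprende sobre {{Zero-Knowledge}}",  // illustrative entry
  "Zero-Knowledge": "Zero-Knowledge"  // identity entry, as in nontranslate.json
});

console.log(qq("Server changed to [[server]]", {server: "spideroak.com"}));
// prints "Servidor cambiado a spideroak.com"
console.log(qq("Learn about [[Zero-Knowledge]]"));
// prints "Aprende sobre Zero-Knowledge"
```

The second call shows why trade terms can stay bracketed without a parameter: the term is resolved by looking it up as a key in its own right.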

    Maintaining Translations

    We mostly change only the en-GB JSON source file (www/locales/en-GB.json) and, for just those strings that have a different US-English translation, the en-US one (www/locales/en-US.json), and leave it to the translators to update the respective locale-specific mappings. We also maintain non-translated strings (typically, terms of trade) in www/nontranslate.json. We occasionally have reason to edit the non-English locale mappings.

    In any of these cases, our translators will track the changes in the derived XLIFF externalizations that we send, and provide updates for all the affected strings.

    When editing the JSON sources:

    • The translators are tracking differences from prior versions, so we can change existing entries in-place, to communicate revisions.
    • Add new items just before the last, "NO TRAILING COMMA" item
      • Use the same structure as the prior entries (including trailing comma)
    • To change existing items, edit them in place
    • The translation library will fail in the face of invalid JSON
      • When necessary, verify the syntax by pasting a copy of the file into an online JSON validator, like JSON Lint
      • If a JSON syntax error does sneak through:
        • You'll see some or all of:
          • undefined showing up in the interface in lots of places, in the stead of proper app strings
          • Console log messages starting with Uncaught SyntaxError: ...
          • Numerous console log messages: No translations available (yet)
        • If you see any of these, use JSON Lint to find the error.
    • Make sure you're using the right characters for bracketed terms in source ([[...]]) vs. target ({{...}}) strings, for which see Editing app literals - Details.
    • Once you do get the updates arranged, check them in and go through the process to Convey current versions of strings for translation

    See instead Establish a new translation locale, for initiating a new, separate translation locale.
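As a local alternative to pasting a file into an online validator, the syntax and bracket checks listed above could be scripted along these lines. This is only a sketch: checkLocaleJson is a hypothetical helper, not part of our build, and it assumes each locale file is a JSON object whose leaf values are the target strings.

```javascript
// Hypothetical helper: checks the text of one locale JSON file.
// Returns a list of human-readable problems (empty when none are found).
function checkLocaleJson(jsonText) {
  var parsed;
  try {
    parsed = JSON.parse(jsonText);
  } catch (err) {
    // Typical cause: a trailing comma after the last entry.
    return ["Invalid JSON: " + err.message];
  }
  if (parsed === null || typeof parsed !== "object") {
    return ["Top level must be a JSON object"];
  }
  var problems = [];
  function walk(entries) {
    Object.keys(entries).forEach(function (key) {
      var target = entries[key];
      if (target !== null && typeof target === "object") {
        walk(target);  // tolerate nested objects, if the file has any
        return;
      }
      if (typeof target !== "string") {
        problems.push("Non-string target for key: " + key);
        return;
      }
      if (key.indexOf("{{") !== -1) {
        problems.push("Key uses {{...}} instead of [[...]]: " + key);
      }
      if (target.indexOf("[[") !== -1) {
        problems.push("Target uses [[...]] instead of {{...}}: " + key);
      }
    });
  }
  walk(parsed);
  return problems;
}

// Example: a trailing comma is reported as a single "Invalid JSON" problem.
console.log(checkLocaleJson('{"Server changed to [[server]]": "x",}').length);
// prints 1
```

In practice the text would come from a locale file, for example via require("fs").readFileSync("www/locales/en-GB.json", "utf8").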

    Establish a new translation locale

    Mostly, we just ask the translators to provide a new locale, based on our English master (en-GB). Case #2 describes an existing exception, which can serve as a model if we eventually implement others.

    1. When adding a new, completely distinct locale, we:
      • Determine the proper code for the locale. (Use the unqualified code, if you can. Reasons to use a country-qualified code may be best handled by case #2.)
      • Ask the translator to send a new XLIFF file for the language code
      • Register the derived JSON file in the master index
      • (Ensure that local changes to the translation files are committed or otherwise preserved.)
      • Include the new XLIFF .xml file in the xliff subdir
      • Run grunt xliff:to_json to produce the derived JSON file.
        • Also run grunt xliff:from_json before committing changes, to normalize the XLIFF version.
      • Confirm then commit the changes.
    2. Sometimes, the new locale is just a minor variant on an existing one, with only a few strings that need to be maintained separately. This is so for the en-US variant of the Commonwealth English base, en-GB.
      • Decide which variant is going to serve as the primary
        • Register that variant in www/i18n.json under the unqualified code
        • Register the secondary under its explicitly qualified code.
      • For example, we might register www/locales/es-ES => es and www/locales/es-MX => es-MX.
      • Include in the secondary locale's JSON file only those strings which are different than those in the primary.

    Register a new translation file in the app's master index

    In either case, the new files will be situated in www/locales and:

    • entries for the new file need to be included in the master index, www/i18n.json. Follow the conventions for the existing entries in the new one you create.
    • How language-variant fall-throughs work:
      • Keys that are unqualified codes inherently serve as fall-throughs for any qualified codes that have the unqualified code as their base. Eg:
        • "en" => "locales/en-GB.json"
        • "en-US" => "locales/en-US.json"
        • Thereby, the fall-through for en-US = en => "locales/en-GB.json"
      • At least initially, however, most keys are unqualified codes that go to the only variants we have for those languages.

    Revise how app locality is detected

    See the code described in Translation facility: html10n.js.

    Coordinate with translators

    We share our translation data with the translators using XLIFF, and use grunt commands to produce XLIFF externalizations from our JSON sources, and vice versa, to incorporate XLIFF updates from the translators by converting them back to JSON. The instructions below detail procedures using those commands, and include guidance on using version control to best track our changes and avoid inadvertent losses. See Conveying translation updates to and from the Translators for details.

    As with most application translations, some of our strings include special terms that have to be treated specially by the translators. The next section, Instructions for the translators, details what the translators need to do to preserve these special strings.

    We also have a translation file for non-translated terms, but it is kept separate from the localized files. It is maintained internally, not passed to the translators at all, unless they ask for that list of terms. (They mostly shouldn't need it, since they only see those strings within source strings in brackets. All bracketed terms are to be conveyed unchanged within translation targets.)

    Instructions for the translators

    • We will provide the translations we already have as XLIFF .xml files
    • The parts of the strings that should not be changed in the translation:
      • HTML syntax (tags, tag attributes, etc) should be preserved as-is
      • The terms within "[[...]]" square brackets should be preserved in the source strings, and the translated strings should include the same, untranslated terms, but the brackets surrounding the terms in the translated strings should instead be "{{...}}" curly braces.
      • Example, in Spanish:
        • Source: Access to link [[url]] failed: [[statusText]] ([[statusCode]])
        • Target: El acceso al enlace {{url}} falló: {{statusText}} ({{statusCode}})
    • When in doubt, please see already done items for plenty of examples, or feel free to ask us for guidance.
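The term-preservation rule above lends itself to a mechanical check on returned translations. The following is a sketch under our own assumptions; extractTerms and termsPreserved are hypothetical helper names, not functions from html10n.js or our codebase.

```javascript
// Hypothetical helper: lists the special terms in a string, given the opening
// and closing delimiters ("[[", "]]" for sources; "{{", "}}" for targets).
function extractTerms(text, open, close) {
  var terms = [];
  var start = text.indexOf(open);
  while (start !== -1) {
    var end = text.indexOf(close, start + open.length);
    if (end === -1) { break; }
    terms.push(text.slice(start + open.length, end));
    start = text.indexOf(open, end + close.length);
  }
  return terms.sort();
}

// True when the target carries exactly the same terms as the source,
// with curly braces in place of the source's square brackets.
function termsPreserved(source, target) {
  var wanted = extractTerms(source, "[[", "]]");
  var found = extractTerms(target, "{{", "}}");
  return wanted.length === found.length &&
    wanted.every(function (term, i) { return term === found[i]; });
}

console.log(termsPreserved(
  "Access to link [[url]] failed: [[statusText]] ([[statusCode]])",
  "El acceso al enlace {{url}} falló: {{statusText}} ({{statusCode}})"
));
// prints true
```

A target that translates a term (say, {{enlace}} for [[url]]) or keeps the square brackets would make termsPreserved return false.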

    Conveying translation updates to and from the Translators

    Both sending changes to the translators and receiving updates from them depend on grunt commands we've implemented for converting JSON to XLIFF and back. In both cases we do "round-trip" conversions, from JSON to XLIFF and back, or vice versa. We do this to "normalize" the layout of the data files for checking in to version control, to avoid spurious differences that obscure salient ones.
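The two round-trip directions can be summarized in a small script. This is a sketch only: roundTripCommands and roundTrip are hypothetical helpers that do not exist in our repository. The only commands they name are the two grunt tasks described on this page, and commits are deliberately left to the person running them.

```javascript
var childProcess = require("child_process");

// Hypothetical helper. `origin` is "json" when our JSON edits are the source
// of truth, or "xliff" when the translators' XLIFF files are.
function roundTripCommands(origin) {
  if (origin === "json") {
    return ["grunt xliff:from_json", "grunt xliff:to_json"];
  }
  if (origin === "xliff") {
    return ["grunt xliff:to_json", "grunt xliff:from_json"];
  }
  throw new Error('origin must be "json" or "xliff"');
}

// Runs the round trip. `run` executes one shell command; the default runner
// uses execSync, which throws if a command exits with a non-zero status.
function roundTrip(origin, run) {
  run = run || function (command) {
    childProcess.execSync(command, {stdio: "inherit"});
  };
  roundTripCommands(origin).forEach(function (command) {
    run(command);
  });
}

// Dry run: print the commands instead of executing them.
roundTrip("json", function (command) { console.log(command); });
```

Passing a stub runner, as in the last line, makes it easy to review the order of the commands before running them for real.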

    Convey current versions of strings for translation

    Our suggested process actually includes two commits for each dispatch to translators, the first to prevent inadvertent loss of changes and the subsequent one to include the final, normalized versions of the changed files.

    So, when you have a new version of the app strings ready for dispatch to the translators:

    Be sure that any XLIFF changes, from the translators, are already processed and committed before doing this.

    1. Do a commit, confirming that it includes your intended JSON file revisions.
    2. Run grunt xliff:from_json to convey the current set of JSON locales, www/locales/*.json, to the XLIFF equivalents, in xliff/*.xml.
      • You could check that the XLIFF files change as expected, though that can be obscured by numerous incidental changes, since the XLIFF process maintains sequence numbers that may change spuriously.
    3. Run grunt xliff:to_json to complete the "round-trip" back onto the original JSON files. This normalizes the JSON files layout to that enforced by the xliff conversion library.
    4. Do a git commit to check in the XLIFF revisions and normalized JSON files.
    5. Send the entire set of XLIFF .xml files to the translators. They are responsible for tracking the changes in the en-GB.xml file and conveying those changes to the various translations.

    Receive and incorporate translated strings from translators

    We should receive revisions from the translators in the form of XLIFF .xml files, which we situate in the xliff/ subdirectory and then convert, using a custom grunt command, to the corresponding JSON files for consumption by the html10n.js localization library.

    Be sure that any locally originated string changes are preserved - eg, by git commit or stashing - before placing updated XLIFF files from translators.

    1. Situate the received XLIFF .xml files in the xliff/ subdirectory.
      • You can examine the changes by doing a git diff, but may see numerous spurious diffs if the sequence numbering shifts.
    2. Run grunt xliff:to_json to derive the corresponding JSON files, in www/locales/*.json.
      • Now you should be able to examine exactly what has changed using git diff, since there's no sequence numbering and few other incidental artifacts in the files. Order changes or other spurious reorganizations should be rare.
      • If the xliff-to-json conversion process fails
        • Use a process of elimination to identify which XLIFF file(s) have faulty syntax
        • Run the faulty files through an XML lint processor to find the error.
        • Report the errors to the translators, so they can shake problems out of their process!
    3. Test the changes by running a new build on a device where you can set the language.
      • You can do so using a security-inhibited browser run of the app - see the debugging note in html10n.js hookup.
    4. Run grunt xliff:from_json to complete the "round-trip" back onto the XLIFF files received from the translators, to normalize them.
    5. Do a git commit to check in the revisions.

    Whee!

    Implementation Specifics

    Translation facility: html10n.js

    • html10n.js

    • our own fork, for bug fixes and custom tailoring

    html10n.js hookup

    • Our code for html10n.js in src/helpers/localizer.js
      • window.localizer.prepareHtml10n() does most of the work
        • It is invoked early in spiderOakApp.ready()
        • Registers the active locales, via html10n.localize()
        • Binds a "go" function to trigger on html10n localized event
          • including setting the moment.js locale
          • and setting some critical document locale characteristics
    • debugging: can use browser developer console to change locale on the fly:
      • E.g., while app is running: html10n.localize(["ru", "en", "nontranslate"])
        • The argument must be an array.
        • The first array element is for the locality, the second is the fall-through language, and the third is for the non-translated trade-terms, like Zero-Knowledge.
      • A few things will not be re-rendered until traversing to the login page:
        • the menu sheet
        • the preliminary page
      • some things, like login page, will re-render immediately
      • some will require leaving the page and returning

    possible alternative - polyglot.js

    Airbnb's polyglot.js looks like a simpler alternative.

    • It doesn't do the json file loading or include indirection that we get with html10n.js
    • Initialization looks simpler, which may be a plus and a minus - no inherent event awareness.
    • html10n.js seems to be working OK with our fixes, so this note is just in case of trouble.

    Language code provisions

    The app's locale configuration has some important intricacies.

    The html10n.localize() array argument provides, in effect, a "search-path" for resolving localized strings, with the prospective sources earlier in the path taking precedence over the later ones. (In actuality, the library accumulates a mapping from translation keys to targets, going in reverse order over the array of codes so that translation entries from items earlier in the array take precedence.)
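The reverse-order accumulation described above can be pictured with a short sketch. It illustrates the precedence rule only and is not the library's code: mergeBySearchPath is a hypothetical name, and the "Colour" and "Cancel" entries are made-up examples.

```javascript
// Hypothetical sketch of the precedence rule; not html10n.js's own code.
// `sources` is an array of key-to-target mappings, highest priority first.
function mergeBySearchPath(sources) {
  var merged = {};
  // Walk from lowest to highest priority, so that entries from earlier
  // sources overwrite entries from later ones.
  for (var i = sources.length - 1; i >= 0; i--) {
    var source = sources[i];
    Object.keys(source).forEach(function (key) {
      merged[key] = source[key];
    });
  }
  return merged;
}

var enUS = {"Colour": "Color"};                       // only the differences
var enGB = {"Colour": "Colour", "Cancel": "Cancel"};  // full fall-through set
var merged = mergeBySearchPath([enUS, enGB]);
console.log(merged["Colour"], merged["Cancel"]);
// prints "Color Cancel"
```

The en-US source wins for the one key it defines, and every other key falls through to en-GB.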

    We prime the html10n machinery (see html10n.js hookup) to use the locale (as reported by the browser), if available, and fall-through to en-GB. Strings not provided by one translation, in a particular session, may be provided by one that has lower precedence in that session. Some strings are only provided by a special pseudo-translation, included in all sessions.

    Locale load order
    1. The first-priority locale is the one identified by either navigator.language or navigator.userLanguage, in that order.
    2. Any prospective locales for which we have no translation are skipped. However, if the skipped code is a qualified one (e.g., es-MX), then, by #3, the unqualified code (e.g. es) is still tried.
    3. Any prospective locale that has the form of a qualified code - for example, en-GB - is implicitly followed by the unqualified language code.
      • Thus, e.g., es-ES and es-MX will both also include es.
    4. Finally, www/nontranslate.json is included at the end of all translation source load paths, so the (identity) resolutions of the strings are available in all cases.

    Item 3, implicitly including the non-qualified version of a locale, in effect immediately after the qualified version, enables some important features:

    • Because of this, those locales for which we have only one variant are registered under just the unqualified code. At least in this early stage, that applies to most of our locale entries: de, es, fr, pt, and ru.
    • Only our English variants are currently qualified, because we have variants within a major (en) code.
    • Since more English variants derive from the "Commonwealth English" en-GB, that is our fall-through locale.
      • Thus, the en-GB fall-through is used for any locales for which we lack any other translation.
      • Plus, our en-US translation actually only includes the strings that differ from the en-GB translation, since the en-GB translation, as fall-through, will be used for any strings not satisfied by en-US.
    • Other, non-English translations with multiple variants can use a similar scheme, where the full translation occupies the unqualified slot, and the minor variant is registered with full qualification. For example, if es-MX has just a few differences from es-ES, then es-ES could be registered as just es, and the es-MX need include only the strings that are distinct from es-ES. Since es is implicitly included immediately after es-MX, then the items missing from es-MX will be resolved via es ==> es-ES.
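The load-order rules above can be sketched as a function that builds the search path. This is an illustration, not the app's localizer code: buildLoadPath is a hypothetical name, the index object stands in for the mappings in www/i18n.json (whose exact format is not shown here), and the "locales/es.json" path is assumed for the example.

```javascript
// Hypothetical sketch of the load-order rules; not the app's localizer code.
// `index` stands in for the code-to-file mappings of www/i18n.json.
function buildLoadPath(browserLocale, index, fallThrough) {
  var path = [];
  function add(code) {
    // Rule 2: skip codes with no translation; also avoid duplicates.
    if (Object.prototype.hasOwnProperty.call(index, code) &&
        path.indexOf(code) === -1) {
      path.push(code);
    }
  }
  // Rule 1: the browser's locale comes first, then the fall-through.
  [browserLocale, fallThrough].forEach(function (code) {
    if (!code) { return; }
    add(code);
    // Rule 3: a qualified code is implicitly followed by its base code.
    var dash = code.indexOf("-");
    if (dash !== -1) { add(code.slice(0, dash)); }
  });
  // Rule 4: the non-translated terms always come last.
  path.push("nontranslate");
  return path;
}

var index = {
  "en": "locales/en-GB.json",
  "en-US": "locales/en-US.json",
  "es": "locales/es.json"  // assumed file name, for illustration
};
console.log(buildLoadPath("es-MX", index, "en").join(", "));
// prints "es, en, nontranslate"
console.log(buildLoadPath("en-US", index, "en").join(", "));
// prints "en-US, en, nontranslate"
```

In a browser, the first argument would be navigator.language || navigator.userLanguage, and the resulting array has the shape that html10n.localize() expects.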

    Translation Files

    JSON source files

    • Master file: www/i18n.json
      • Index that identifies mappings from locale to actual files.
      • In addition to the basic mappings, we use those mappings to identify the primary translation for a locale that has varying country-qualified codes.
        • For example, en-GB is the preferred code for en.
        • This means that we don't have to provide individual aliases for, e.g., the many commonwealth English variants, like en-AU, en-CA, ..., since en is implicitly included, after the country-qualified code, automatically.
        • And the non-Commonwealth English variant, en-US, needs to contain only the strings that are different from those in the en-GB collection. Those that are the same will be obtained from the en entry that is implicitly included, automatically, after the en-US entry.
    • The .json locale-specific files, used by the app to map app string keys to locality translations
      • One locale per file
      • Located in www/locales/*.json
      • Each file includes a JSON object which maps the English key strings to the respective target locale strings.
        • The source and target strings can include bracketed terms, which are to be preserved, unchanged, in the target strings, as described in the Editing app literals - Details section.
      • In general we edit only the en-GB.json and en-US.json files. The other locales files are derived from XLIFF files which we get from the translators.
        • (We do make changes to other JSON locale files when we are alerted to corrections. For them, we need to do a round-trip to XLIFF and back, to comprehensively check in the changes.)

    JSON source string nuances:

    • doT.js requires that "/" forward slash is escaped with "\" backslash in templates,

      • but those "\" backslashes aren't seen in the JSON externalizations
      • and so aren't needed at all in the source files.
      • good thing, because e.g. xliff conversions necessarily drop them,
      • which would lead to round-trip discrepancies if they're included in the JSON versions.
    • Non-translated strings file: www/nontranslate.json

      • Any strings which are not supposed to have non-English translations
        • E.g. Zero-Knowledge and ShareRoom
      • Generally, terms of trade.
      • This file consists of entries for these strings, with the translation being identical to the string itself.
      • This file is not included among those from which the translator's XLIFF files are derived.
      • The file is included as the last item on the html10n.localize() array, so that bracketed references to the non-translated strings are resolved from it.
      • Hence we don't have to ask the translators to make exceptions for the non-translated strings - the only instances they see of the strings are bracketed, and they're generally not supposed to translate bracketed strings.

    XLIFF externalization for translator coordination

    • Using grunt-xliff.
    • Grunt commands to convert JSON to and from XLIFF, XLIFF files to xliff/
      • See xliff/README.md for basic usage instructions
      • including foibles:
        • The 'languages' target doesn't work, as far as I can tell.
        • But that's just as well, for our concerns.
    • To produce XLIFF files for translators:

      • grunt xliff:from_json
      • Derive xliff/*.xml, one for each .json in www/locales/*.json
      • BEWARE that this will overwrite the corresponding .xml files in xliff - be sure you've incorporated the translators' last batch of changes before doing this.
      • Any time you derive the XLIFF files from the JSON ones:
        • Before checking in the JSON changes, do the reverse conversion, from XLIFF to JSON (see "To incorporate changes in XLIFF files received from translators", below), in order to normalize the JSON sources to the layout produced by the XLIFF conversion. This will reduce superfluous change records.
        • Check-in the changes to the XLIFF files, preferably along with checking in the corresponding JSON changes, to keep their histories in sync.
    • To incorporate changes in XLIFF files received from translators:

      • grunt xliff:to_json
      • Derive www/locales/*.json, one for each xliff/*.xml
      • BEWARE that this will overwrite the corresponding .json files in www/locales/*.json
        • This means you really want to "round-trip" (see above) and check in any JSON source changes made locally before starting to incorporate changes from translators
        • ... so that we have clear audit-trail for internally originated and externally received changes
        • ... and can thereby clearly reconcile collisions, when necessary.
    • The XLIFF .xml files, for coordinating translations with the translators.

      • Located in xliff subdir
      • Derived from JSON versions using grunt xliff:from_json
      • Produce .json files from XLIFF using grunt xliff:to_json
        • This will overwrite the JSON versions, so be sure the JSON content is already committed, or already represented in the XLIFF versions (eg, for a round-trip normalization, see next).
      • "Round trip normalization": Before committing .json changes, e.g. to the English strings, do a round trip conversion from the JSON files to the XLIFF .xml files then back to JSON.
        • This way, you can check in the corresponding XLIFF changes simultaneously
        • In the process you also normalize the JSON files to the output produced by the XLIFF-to-JSON conversion, thereby avoiding subsequent spurious whitespace differences for the hand-edited lines in the JSON versions.