From 5dbaee747adc7d4b67a0b4be8e75243140d77298 Mon Sep 17 00:00:00 2001 From: Sean Hammond Date: Wed, 2 Apr 2014 21:10:27 +0200 Subject: [PATCH] [#809] Add i18n guide for devs Fixes #809 --- doc/_themes/sphinx-theme-okfn | 1 + doc/contributing/frontend/index.rst | 7 +- doc/contributing/html.rst | 6 +- doc/contributing/i18n.rst | 5 + doc/contributing/index.rst | 1 + doc/contributing/javascript.rst | 5 + doc/contributing/python.rst | 5 + doc/contributing/string-i18n.rst | 396 ++++++++++++++++++++++++++++ doc/extensions/best-practices.rst | 8 + doc/theming/best-practices.rst | 10 +- doc/theming/javascript.rst | 5 + doc/theming/templates.rst | 5 + 12 files changed, 447 insertions(+), 7 deletions(-) create mode 160000 doc/_themes/sphinx-theme-okfn create mode 100644 doc/contributing/string-i18n.rst diff --git a/doc/_themes/sphinx-theme-okfn b/doc/_themes/sphinx-theme-okfn new file mode 160000 index 00000000000..4628f26abf4 --- /dev/null +++ b/doc/_themes/sphinx-theme-okfn @@ -0,0 +1 @@ +Subproject commit 4628f26abf401fdb63ec099384bff44a27dcda4c diff --git a/doc/contributing/frontend/index.rst b/doc/contributing/frontend/index.rst index 7232c736d46..f64a9f723f1 100644 --- a/doc/contributing/frontend/index.rst +++ b/doc/contributing/frontend/index.rst @@ -11,6 +11,11 @@ Frontend development guidelines template-blocks javascript-module-tutorial +.. seealso:: + + :doc:`/contributing/string-i18n` + How to mark strings for translation. + ----------------------------- Install frontend dependencies ----------------------------- @@ -313,7 +318,7 @@ useful in the future. Internationalization ==================== -All strings within modules should be internationalised. Strings can be +All strings within modules should be internationalized. Strings can be set in the ``options.i18n`` object and there is a ``.i18n()`` helper for retrieving them. 
diff --git a/doc/contributing/html.rst b/doc/contributing/html.rst index dc850adb5d7..b7e88d19919 100644 --- a/doc/contributing/html.rst +++ b/doc/contributing/html.rst @@ -2,6 +2,11 @@ HTML coding standards ===================== +.. seealso:: + + :doc:`string-i18n` + How to mark strings for translation. + ---------- Formatting ---------- @@ -151,4 +156,3 @@ And **not**: ::

Blah foo blah New paragraph, blah

- diff --git a/doc/contributing/i18n.rst b/doc/contributing/i18n.rst index c5895c1383f..b2c7baba6b5 100644 --- a/doc/contributing/i18n.rst +++ b/doc/contributing/i18n.rst @@ -6,6 +6,11 @@ CKAN is used in many countries, and adding a new language to the web interface i CKAN uses the url to determine which language is used. An example would be ``/fr/dataset`` would be shown in french. If CKAN is running under a directory then an example would be ``/root/fr/dataset``. For custom paths check the :ref:`ckan.root_path` config option. +.. seealso:: + + Developers, see :doc:`string-i18n` for how to mark strings for translation in + CKAN code. + .. Note: Storing metadata field values in more than one language is a separate topic. This is achieved by storing the translations in extra fields. A custom dataset form and dataset display template are recommended. Ask the CKAN team for more information. ------------------- diff --git a/doc/contributing/index.rst b/doc/contributing/index.rst index 58960522c27..f674454fa1c 100644 --- a/doc/contributing/index.rst +++ b/doc/contributing/index.rst @@ -32,6 +32,7 @@ of contributions to CKAN: html javascript python + string-i18n testing frontend/index diff --git a/doc/contributing/javascript.rst b/doc/contributing/javascript.rst index cb4a3a8798f..17a7f1cb36b 100644 --- a/doc/contributing/javascript.rst +++ b/doc/contributing/javascript.rst @@ -2,6 +2,11 @@ JavaScript coding standards =========================== +.. seealso:: + + :doc:`string-i18n` + How to mark strings for translation. + ---------- Formatting ---------- diff --git a/doc/contributing/python.rst b/doc/contributing/python.rst index ed396398f38..8dc6c1f5c17 100644 --- a/doc/contributing/python.rst +++ b/doc/contributing/python.rst @@ -11,6 +11,11 @@ Some good links about Python code style: - `Python Coding Standards `_ from Yahoo - `Google Python Style Guide `_ +.. seealso:: + + :doc:`string-i18n` + How to mark strings for translation. 
+ Use single quotes ----------------- diff --git a/doc/contributing/string-i18n.rst b/doc/contributing/string-i18n.rst new file mode 100644 index 00000000000..6909e33c5de --- /dev/null +++ b/doc/contributing/string-i18n.rst @@ -0,0 +1,396 @@ +=========================== +String internationalization +=========================== + + +All user-facing strings in CKAN Python, JavaScript and Jinja2 code should be +internationalized, so that our translators can then localize the +strings for each of the many languages that CKAN supports. This guide shows +CKAN developers how to internationalize strings, and what to look for regarding +string internationalization when reviewing a pull request. + +.. note:: + + *Internationalization* (or i18n) is the process of marking strings for + translation, so that the strings can be extracted from the source code and + given to translators. + *Localization* (l10n) is the process of translating the marked strings into + different languages. + +.. seealso:: + + :doc:`i18n` + If you want to translate CKAN, this page documents + the process that translators follow to localize CKAN into different + languages. + + :doc:`release-process` + The processes for extracting internationalized strings from CKAN and + uploading them to Transifex to be translated, and for downloading the + translations from Transifex and loading them into CKAN to be displayed + are documented on this page. + +.. note:: + + Much of the existing code in CKAN was written before we had these + guidelines, so it doesn't always do things as described on this page. + When writing new code you should follow the guidelines on this page, not the + existing code. + + +---------------------------------------------- +Internationalizing strings in Jinja2 templates +---------------------------------------------- + +Most user-visible strings should be in the Jinja2 templates, rather than in +Python or JavaScript code. 
This doesn't really matter to translators, but it's +good for the code to separate logic and content. Of course this isn't always +possible. For example when error messages are delivered through the API, +there's no Jinja2 template involved. + +The preferred way to internationalize strings in Jinja2 templates is by using +`the trans tag from Jinja2's i18n extension `_, +which is available to all CKAN core and extension templates and snippets. + +Most of the following examples are taken from the Jinja2 docs. + +To internationalize a string put it inside a ``{% trans %}`` tag: + +.. code-block:: jinja + +

{% trans %}This paragraph is translatable.{% endtrans %}

+ +You can also use variables from the template's namespace inside a +``{% trans %}``: + +.. code-block:: jinja + +

{% trans %}Hello {{ user }}!{% endtrans %}

+ +(Only variable tags are allowed inside trans tags, not statements.) + +You can pass one or more arguments to the ``{% trans %}`` tag to bind variable +names for use within the tag: + +.. code-block:: jinja + +

{% trans user=user.username %}Hello {{ user }}!{% endtrans %}

+ + {% trans book_title=book.title, author=author.name %} + This is {{ book_title }} by {{ author }} + {% endtrans %} + +To handle different singular and plural forms of a string, use a ``{% pluralize +%}`` tag: + +.. code-block:: jinja + + {% trans count=list|length %} + There is {{ count }} {{ name }} object. + {% pluralize %} + There are {{ count }} {{ name }} objects. + {% endtrans %} + +(In English the first string will be rendered if ``count`` is 1, the second +otherwise. For other languages translators will be able to provide their own +strings for different values of ``count``.) + +The first variable in the block (``count`` in the example above) is used to +determine which of the singular or plural forms to use. Alternatively you can +explicitly specify which variable to use: + +.. code-block:: jinja + + {% trans ..., user_count=users|length %} + ... + {% pluralize user_count %} + ... + {% endtrans %} + +The ``{% trans %}`` tag is preferable, but if you need to internationalize a string +within a Jinja2 expression you can use the ``_()`` and ``ungettext()`` +functions: + +.. code-block:: jinja + + {% set hello = _('Hello World!') %} + +To use variables in strings, use Python `format string syntax`_ +and then call the ``.format()`` method on the string that ``_()`` returns: + +.. _format string syntax: https://docs.python.org/2/library/string.html#formatstrings + +.. code-block:: jinja + + {% set hello = _('Hello {name}!').format(name=user.name) %} + +Singular and plural forms are handled by ``ungettext()``: + +.. code-block:: jinja + + {% set text = ungettext( + '{num} apple', '{num} apples', num_apples).format(num=num_apples) %} + +.. note:: + + There are also ``gettext()`` and ``ngettext()`` functions available to + templates, but we recommend using ``_()`` and ``ungettext()`` for + consistency with CKAN's Python code. + This deviates from the Jinja2 docs, which do use ``gettext()`` and + ``ngettext()``. 
+ + ``_()`` is not an alias for ``gettext()`` in CKAN's Jinja2 templates: + ``_()`` is the function provided by Pylons, whereas ``gettext()`` is the + version provided by Jinja2, and their behaviors are not exactly the same. + + +----------------------------------------- +Internationalizing strings in Python code +----------------------------------------- + +CKAN uses the :py:func:`~pylons.i18n._` and :py:func:`~pylons.i18n.ungettext` +functions from the `pylons.i18n.translation`_ module to internationalize +strings in Python code. + +.. _pylons.i18n.translation: http://docs.pylonsproject.org/projects/pylons-webframework/en/latest/modules/i18n_translation.html#module-pylons.i18n.translation + +Core CKAN modules should import :py:func:`~ckan.common._` and +:py:func:`~ckan.common.ungettext` from :py:mod:`ckan.common`, +i.e. ``from ckan.common import _, ungettext`` +(don't import :py:func:`pylons.i18n.translation._` directly, for example). + +CKAN plugins should import :py:mod:`ckan.plugins.toolkit` and use +:py:func:`ckan.plugins.toolkit._` and +:py:func:`ckan.plugins.toolkit.ungettext`, i.e. do +``import ckan.plugins.toolkit as toolkit`` and then use ``toolkit._()`` and +``toolkit.ungettext()`` (see :doc:`/extensions/plugins-toolkit`). + +To internationalize a string pass it to the ``_()`` function: + +.. code-block:: python + + my_string = _("This paragraph is translatable.") + +To use variables in a string, call the ``.format()`` method on the translated +string that ``_()`` returns: + +.. code-block:: python + + hello = _("Hello {user}!").format(user=user.name) + + book_description = _("This is {book_title} by {author}").format( + book_title=book.title, author=author.name) + +To handle different plural and singular forms of a string, use ``ungettext()``: + +.. 
code-block:: python + + translated_string = ungettext( + "There is {count} {name} object.", + "There are {count} {name} objects.", + num_objects).format(count=num_objects, name=name) + + +--------------------------------------------- +Internationalizing strings in JavaScript code +--------------------------------------------- + +.. todo:: + + +------------------------------------------------- +General guidelines for internationalizing strings +------------------------------------------------- + +Below are some guidelines to follow when marking your strings for translation. +These apply to strings in Jinja2 templates or in Python or JavaScript code. +These are mostly meant to make life easier for translators, and help to improve +the quality of CKAN's translations: + +* Leave as much HTML and other code out of the translation string as possible. + + For example, don't include surrounding ``

...

`` tags in the marked + string. These aren't necessary for the translator to do the translation, + and if the translator accidentally changes them in the translation string + the HTML will be broken. + + Good: + + .. code-block:: jinja + +

{% trans %}Don't put HTML tags inside translatable strings{% endtrans %}

+ + Bad (``

`` tags don't need to be in the translation string): + + .. code-block:: python + + mystring = _("

Don't put HTML tags inside translatable strings

") + +* But don't split a string into separate strings. + + Translators need as much context as possible to translate strings well, and + if you split a string up into separate strings and mark each for translation + separately, translators must translate each of these separate strings in + isolation. Also, some languages may need to change the order of words in a + sentence or even change the order of sentences in a paragraph, splitting + into separate strings makes assumptions about word order. + + It's better to leave HTML tags or other code in strings than to split a + string. For example, it's often best to leave HTML ```` tags in rather + than split a string. + + Good: + + .. code-block:: python + + _("Don't split a string containing some markup into separate strings.") + + Bad (text will be difficult to translate or untranslatable): + + .. code-block:: python + + _("Don't split a string containing some ") + "" + _("markup") + + _("into separate strings.") + +* You can split long strings over multiple lines using parentheses to avoid + long lines, Python will concatenate them into a single string: + + Good: + + .. code-block:: python + + _("This is a really long string that would just make this line far too " + "long to fit in the window") + +* Leave unnecessary whitespace out of translatable strings, but do put + punctuation into translatable strings. + +* Try not to make translators translate strings that don't need to be + translated. + + For example, ``'legacy_templates'`` is the name of a directory, it doesn't + need to be marked for translation. + +* Mark singular and plural forms of strings correctly. + + In Jinja2 templates this means using ``{% trans %}`` and ``{% pluralize %}`` + or ``ungettext()``. In Python it means using ``ungettext()``. See above + for examples. + + Singular and plural forms work differently in different languages. + For example English has singular and plural nouns, but Slovenian has + singular, dual and plural. 
+ + Good: + + .. code-block:: python + + num_people = 4 + translated_string = ungettext( + 'There is one person here', + 'There are {num_people} people here', + num_people).format(num_people=num_people) + + Bad (this assumes that all languages have the same plural forms as English): + + .. code-block:: python + + if num_people == 1: + translated_string = _('There is one person here') + else: + translated_string = _( + 'There are {num_people} people here'.format(num_people=num_people)) + +* Don't use `old-style %s string formatting `_ + in Python, use the new `.format() method`_ + instead. + + Strings formatted with ``.format()`` give translators more context. + The ``.format()`` method is also more expressive, and is the preferred way + to format strings in Python 3. + + Good: + + .. code-block:: python + + "Welcome to {site_title}".format(site_title=site_title) + + Bad (not enough context for translators): + + .. code-block:: python + + "Welcome to %s" % site_title + +* Use descriptive names for replacement fields in strings. + + This gives translators more context. + + Good: + + .. code-block:: python + + "Welcome to {site_title}".format(site_title=site_title) + + Bad (not enough context for translators): + + .. code-block:: python + + "Welcome to {0}".format(site_title) + + Worse (doesn't work in Python 2.6): + + .. code-block:: python + + "Welcome to {}".format(site_title) + +* Use ``TRANSLATORS:`` comments to provide extra context for translators + for difficult to find, very short, or obscure strings. + + For example, in Python: + + .. code-block:: python + + # TRANSLATORS: This is a helpful comment. + _("This is an ambiguous string") + + In Jinja2: + + .. code-block:: jinja + + {# TRANSLATORS: This heading is displayed on the user's profile page. #} +

{% trans %}Heading{% endtrans %}

+ + These comments end up in the ``ckan.pot`` file and translators will see them + when they're translating the strings (Transifex shows them, for example). + + .. note:: + + In both Python and Jinja2, the comment must be on the line before the line + with the ``_()``, ``ungettext()`` or ``{% trans %}``, and must start with + the exact string ``TRANSLATORS:`` (in upper-case and with the colon). + This string is configured in ``setup.cfg``. + + .. todo:: + + Example of leaving a translator comment in JavaScript. + Probably ``// TRANSLATORS: This is a helpful comment`` will work. + +.. todo:: + + Explain how to use *message contexts*, where the same exact string may + appear in two different places in the UI but have different meanings. + + For example "filter" can be a noun or a verb in English, and may need two + different translations in another language. Currently if the string + ``_("filter")`` appears in different places in CKAN this will only + produce one string to be translated in the ``ckan.pot`` file. + + I think the right way to handle this with gettext is using ``msgctxt``, + but it looks like babel doesn't support it yet. + +.. todo:: + + Explain how we internationalize dates, currencies and numbers + (e.g. different positioning and separators used for decimal points in + different languages). + +.. _.format() method: https://docs.python.org/2/library/stdtypes.html#str.format diff --git a/doc/extensions/best-practices.rst b/doc/extensions/best-practices.rst index abe565adf0c..a90f5044a23 100644 --- a/doc/extensions/best-practices.rst +++ b/doc/extensions/best-practices.rst @@ -48,3 +48,11 @@ of the extension, to avoid conflicting with core config settings or with config settings from other extensions. 
For example:: ckan.my_extension.show_most_popular_groups = True + + +------------------------------------- +Internationalize user-visible strings +------------------------------------- + +All user-visible strings should be internationalized, see +:doc:`/contributing/string-i18n`. diff --git a/doc/theming/best-practices.rst b/doc/theming/best-practices.rst index e38627ce144..6ff17d5c044 100644 --- a/doc/theming/best-practices.rst +++ b/doc/theming/best-practices.rst @@ -27,12 +27,12 @@ like ``
``. Links created with changes in a new version of CKAN, or if a plugin changes the URL routing. -------------------------------- -Use ``_()`` and ``ungettext()`` -------------------------------- +--------------------------------------------------------------------- +Use ``{% trans %}``, ``{% pluralize %}``, ``_()`` and ``ungettext()`` +--------------------------------------------------------------------- -Always use :py:func:`_` (or, if pluralizaton is needed, :py:func:`ungettext`) -to mark user-visible strings for localization. +All user-visible strings should be internationalized, see +:doc:`/contributing/string-i18n`. ----------------------------------------------------------------- diff --git a/doc/theming/javascript.rst b/doc/theming/javascript.rst index 3c0c5f05885..04a2a0eb277 100644 --- a/doc/theming/javascript.rst +++ b/doc/theming/javascript.rst @@ -28,6 +28,11 @@ to themes. * `jQuery.com `_, including the `jQuery Learning Center `_ +.. seealso:: + + :doc:`/contributing/string-i18n` + How to mark strings for translation in your JavaScript code. + -------- Overview diff --git a/doc/theming/templates.rst b/doc/theming/templates.rst index f5c600114eb..747253c45e7 100644 --- a/doc/theming/templates.rst +++ b/doc/theming/templates.rst @@ -8,6 +8,11 @@ CKAN pages are generated from Jinja2_ template files. This tutorial will walk you through the process of writing your own template files to modify and replace the default ones, and change the layout and content of CKAN pages. +.. seealso:: + + :doc:`/contributing/string-i18n` + How to mark strings for translation in your template files. + ------------------------- Creating a CKAN extension