xgettext cannot extract strings from templates #622

Open
igungor opened this Issue Oct 31, 2012 · 11 comments

5 participants

igungor commented Oct 31, 2012

Consider this example:

<input type="text" value={{ _("search term") }} />

I can extract the _()-wrapped string with xgettext properly, but when I translate it and use tornado.locale.load_gettext_translations(), I get only the translation of "search", not of "search term".

PO file is like below:

#: templates/index.html:1

msgid "search term"
msgstr "terim ara"

But I only see "terim" in the rendered page, not "terim ara". The problem is that the original string is the value of an attribute and isn't wrapped in double quotes, so the translated string obtained through load_gettext_translations() is inserted unquoted and contains whitespace.

The translated string should be quoted by default; otherwise all we get is the first word before the whitespace.
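To illustrate why only the first word survives, here is a minimal sketch using Python's standard html.parser module. The markup strings are assumptions: they stand in for what the template would render once the translation is substituted, with and without quotes around the attribute value. The helper names (AttributeCollector, parse_attributes) are made up for this example.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collects the attributes of every start tag it sees."""

    def __init__(self):
        super().__init__()
        self.attrs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None when absent.
        self.attrs.extend(attrs)


def parse_attributes(markup):
    """Return the attributes found in the given markup as a dict."""
    collector = AttributeCollector()
    collector.feed(markup)
    collector.close()
    return dict(collector.attrs)


# Assumed rendered output without quotes around the template expression.
unquoted = parse_attributes('<input type="text" value=terim ara />')
# Assumed rendered output with quotes written in the HTML.
quoted = parse_attributes('<input type="text" value="terim ara" />')

print(unquoted.get("value"))
print(quoted.get("value"))
```

An unquoted attribute value ends at the first whitespace, so in the first case the second word is parsed as a separate, valueless attribute rather than as part of "value".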

Owner

bdarnell commented Oct 31, 2012

You mean you have <tag attr={{_('string')}}> instead of <tag attr="{{_('string')}}">? You're supposed to put the quotes in the HTML; the template system never provides them for you. (Why does it work for "search term" but not "terim ara", since both have spaces?)

igungor commented Nov 7, 2012

If you put the quotes in the HTML for the "input" tag, then the string in its "value" attribute cannot be extracted. Try this example:

"""
$ cat index.html
<input type="text" value="{{ _('search term') }}" name="q2" />

$ xgettext -L Python --keyword=_ index.html -d example -o example.pot

$ ls example.pot
ls: example.pot: No such file or directory
"""

If you put quotes around your i18n wrapper, you can't extract it via xgettext. If you don't, you can extract the strings and the "example.pot" template file is created, but Tornado can't get the translated strings properly from the example.mo file, as I wrote in the first comment. Tornado can't substitute the original string with the translated string as intended: only the first word of the translation is substituted.

Owner

bdarnell commented Nov 7, 2012

What's in index.html? That didn't come through in your example.

igungor commented Nov 8, 2012

Sorry, markdown beat my html. here:

Owner

bdarnell commented Nov 8, 2012

OK, so the problem is in how you're invoking xgettext. You're telling xgettext the file is Python, and when it is parsed according to the rules of Python, the _() call appears inside a string literal, so xgettext is correctly ignoring it. We need to either teach xgettext about Tornado template syntax, or maybe just run the template compiler as a preprocessor. I'm not sure what sort of extensibility xgettext has, so I'm not sure how to add this.
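The string-literal point can be checked with a rough sketch that runs Python's own tokenize module over the two template lines from this thread. This is only an approximation: xgettext has its own lexer, and the helper name finds_gettext_call is made up for the example.

```python
import io
import tokenize


def finds_gettext_call(line, keyword="_"):
    """Return True if `keyword(` appears as real tokens, not inside a string."""
    tokens = list(tokenize.generate_tokens(io.StringIO(line + "\n").readline))
    for current, following in zip(tokens, tokens[1:]):
        if (current.type == tokenize.NAME and current.string == keyword
                and following.type == tokenize.OP and following.string == "("):
            return True
    return False


# Expression not wrapped in HTML quotes: _ and ( are separate Python tokens.
unquoted_line = '<input type="text" value={{ _("search term") }} />'
# Expression wrapped in HTML quotes: the whole expression is one string token.
quoted_line = '<input type="text" value="{{ _(\'search term\') }}" name="q2" />'

print(finds_gettext_call(unquoted_line))
print(finds_gettext_call(quoted_line))
```

Under Python's lexical rules the quoted attribute value is a single string literal, so no call to _ is visible in it, which matches xgettext producing no example.pot for the quoted template.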

kinsen commented Jan 29, 2014

I have the same problem, too! Did you solve it?

igungor commented Jan 30, 2014

@kinsen sadly no.

bdarnell added the template label Jul 16, 2014

Contributor

st4lk commented Jan 16, 2015

+1, I also have the same problem.

Contributor

st4lk commented Feb 1, 2015

Here is how I've solved it.
First of all, Tornado templates execute Python code, so we can simply do:

{{ u'"{0}"'.format(_('search term')) }}

This wraps the output of _('search term') in double quotes.

That is fine as a one-off solution, but if there are many such strings, we can define a special translate function that wraps the string in quotes:

import tornado.web


class BaseHandler(tornado.web.RequestHandler):
    def get_template_namespace(self):
        ns = super(BaseHandler, self).get_template_namespace()
        # _Q translates like _ and wraps the result in double quotes.
        ns['_Q'] = lambda *x: u'"{0}"'.format(ns['_'](*x))
        return ns

Now just use _Q in the template:

{{ _Q('search term') }}

And don't forget to add the keywords to the xgettext invocation so that it finds the new function:

xgettext [..old options] --keyword=_Q --keyword=_Q:1,2

fordguo commented Dec 28, 2015

+1, what's the best way?

Owner

bdarnell commented Dec 31, 2015

There are a couple of options. The simplest thing to implement would be a preprocessor that generates the python code from a template and writes it to disk, so you can run xgettext on the template compiler's output. However, this might be awkward to use.

Alternatively, xgettext could be extended to recognize the Tornado template syntax. It doesn't look like xgettext has a plugin system, though, so this could require changes to xgettext itself. If you're willing to use babel instead of xgettext, it's easier: babel is written in Python and has a plugin architecture, so we should be able to provide a plugin to extract strings from templates (by generating the code and passing it to babel's Python extraction plugin).
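As a rough illustration of what such an extraction step might look like, here is a standard-library sketch. It is not the template compiler and not a babel plugin: it swaps in a much simpler technique, pulling out the text of {{ ... }} expressions with a regular expression and parsing each one with Python's ast module. The function name extract_msgids is hypothetical, and {% ... %} blocks and plural forms are not handled.

```python
import ast
import re

# Matches the body of a {{ ... }} template expression (non-greedy).
EXPRESSION = re.compile(r"\{\{(.*?)\}\}", re.DOTALL)


def extract_msgids(template_source, keywords=("_",)):
    """Return the first string argument of each keyword call in {{ }} blocks."""
    msgids = []
    for match in EXPRESSION.finditer(template_source):
        try:
            tree = ast.parse(match.group(1).strip(), mode="eval")
        except SyntaxError:
            # Not a plain Python expression; skip it in this sketch.
            continue
        for node in ast.walk(tree):
            if (isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id in keywords
                    and node.args
                    and isinstance(node.args[0], ast.Constant)
                    and isinstance(node.args[0].value, str)):
                msgids.append(node.args[0].value)
    return msgids


template = '<input type="text" value="{{ _(\'search term\') }}" name="q2" />'
print(extract_msgids(template))
```

Because the expression is taken out of the surrounding HTML before it is parsed, the quotes around the attribute value no longer hide the _() call. A real extractor would still need to record file names and line numbers for the .pot file.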
