# Internationalization and translation (gettext)

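When we write an app we usually want it to adapt to localization needs: when the user switches language, the text should automatically change to the corresponding translation. We could of course write one version per language, with identical code and only the text changed, but then every change has to be repeated in every copy.

A simple solution is to wrap each string in a function and let that function look up the corresponding translation. For example: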


In [1]:
%%writefile international.py
#coding:utf-8
spanishStrings = {'Hello world!': 'Hola Mundo!'}
frenchStrings = {'Hello world!': 'Bonjour le monde!'}
germanStrings = {'Hello world!': 'Hallo Welt!'}
  

Overwriting international.py


In [2]:
from international import *

def trans(s):
    """Translate s into the language named by the global LANGUAGE."""
    if LANGUAGE == 'Spanish':
        return spanishStrings.get(s, s)
    if LANGUAGE == 'French':
        return frenchStrings.get(s, s)
    if LANGUAGE == 'German':
        return germanStrings.get(s, s)
    # English, or any unknown language: return the text unchanged
    return s

In [3]:
LANGUAGE = 'French'
print(trans("Hello world!"))

Bonjour le monde!
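
The chain of `if` statements can also be collapsed into a table lookup. The following is only a sketch of that variation: the `TRANSLATIONS` table and the `language` parameter are names introduced here for illustration, and untranslated strings fall back to the original text.

```python
# Hypothetical variation on trans(): one table keyed by language name.
TRANSLATIONS = {
    "Spanish": {"Hello world!": "Hola Mundo!"},
    "French": {"Hello world!": "Bonjour le monde!"},
    "German": {"Hello world!": "Hallo Welt!"},
}


def trans(s, language="English"):
    """Return the translation of s, or s itself when none is known."""
    return TRANSLATIONS.get(language, {}).get(s, s)


print(trans("Hello world!", "French"))
print(trans("Goodbye!", "French"))  # no entry, so the original is returned
```

Adding a language now means adding one entry to the table, but every string still lives inside the source code.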


But clearly, once the amount of text grows, this approach becomes unmanageable.

Python provides the gettext module to solve this kind of problem.


## Using gettext

> Create the directory layout for the translation files


    ----|
        |-src-|
              |-locale-|
                       |-en-|
                       |    |-LC_MESSAGES
                       |
                       |-cn-|
                       |    |-LC_MESSAGES
                       |
                       |-fr-|
                            |-LC_MESSAGES
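
The layout above could also be created from Python. This is a minimal sketch, assuming a temporary directory stands in for the project's `src` folder so that no real files are touched; in a real project `src` would be the actual source directory.

```python
import os
import tempfile

# A temporary directory stands in for the project's src folder (an assumption
# made so the sketch does not touch real files).
src = tempfile.mkdtemp()

for lang in ("en", "cn", "fr"):
    # exist_ok=True makes the call safe to repeat.
    os.makedirs(os.path.join(src, "locale", lang, "LC_MESSAGES"), exist_ok=True)

print(sorted(os.listdir(os.path.join(src, "locale"))))
```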
        

> Initialize gettext

Use the `pygettext` script to generate the initial gettext template (if your Python installation does not include it, you can [download it here](./pygettext.py)).
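
When `pygettext` is given Python source files, it collects the strings wrapped in `_()` into the template. The sketch below illustrates that extraction idea with the standard `ast` module; it is not how `pygettext` itself is implemented, and `extract_msgids` is a hypothetical helper written for this example.

```python
import ast


def extract_msgids(source):
    """Collect string literals passed as the first argument of _()."""
    msgids = []
    for node in ast.walk(ast.parse(source)):
        is_marker_call = (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_"
        )
        if is_marker_call and node.args:
            first = node.args[0]
            if isinstance(first, ast.Constant) and isinstance(first.value, str):
                msgids.append(first.value)
    return msgids


sample = 'print(_("Hello world!"))\nprint("not marked")\n'
print(extract_msgids(sample))
```

Only the string marked with `_()` is picked up; the plain string literal is ignored.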

In [4]:
!pygettext.py 

In [5]:
!cat messages.pot

# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 15:41+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: ENCODING\n"
"Generated-By: pygettext.py 1.5\n"




We then edit these two fields of the template:

"Content-Type: text/plain; charset=CHARSET\n"
"Content-Transfer-Encoding: ENCODING\n"

In [6]:
%%writefile messages.pot
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=gb2312\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

Overwriting messages.pot
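
The `charset` declared in `Content-Type` tells gettext how to decode the translated text, so it should match the encoding the file is actually saved in. A small sketch of why a mismatch matters, using a sample Chinese string:

```python
text = "世界你好!"

utf8_bytes = text.encode("utf-8")
gb_bytes = text.encode("gb2312")

# The same text is stored as different bytes under each charset.
print(utf8_bytes != gb_bytes)

# Decoding only works when the declared charset matches the real one.
print(gb_bytes.decode("gb2312") == text)
try:
    gb_bytes.decode("utf-8")
except UnicodeDecodeError:
    print("gb2312 bytes are not valid utf-8")
```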


Next we save it as lang.po:

In [7]:
!rename.py messages.pot lang.po

done!


> Register the translation catalogs

In [8]:
%%writefile transfer.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import gettext
langen = gettext.translation('lang', './locale', languages=['en'])
langcn = gettext.translation('lang', './locale', languages=['cn'])
langfr = gettext.translation('lang', './locale', languages=['fr'])


Overwriting transfer.py


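Here:

+ `lang` is the domain, that is, the base name of the `.mo` files to load (usually the name of the module or app being translated),
+ `./locale` is the directory that holds the translation files,
+ the `languages` argument selects the subdirectory of the language to use; here `cn` means the translation file under `./locale/cn/LC_MESSAGES/` is used.

Calling `install()` on one of these objects then gives us a `_()` function for translating text.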
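
To see how the three arguments of `gettext.translation` combine into a file path, the sketch below uses `gettext.find`, which performs the same lookup. It assumes a throwaway locale tree in a temporary directory; the empty `lang.mo` is just a placeholder, enough for the lookup but not a loadable catalog.

```python
import gettext
import os
import tempfile

# Build a throwaway locale tree; the empty lang.mo is only a placeholder.
localedir = os.path.join(tempfile.mkdtemp(), "locale")
mo_dir = os.path.join(localedir, "cn", "LC_MESSAGES")
os.makedirs(mo_dir)
open(os.path.join(mo_dir, "lang.mo"), "wb").close()

# find() resolves localedir/<language>/LC_MESSAGES/<domain>.mo
found = gettext.find("lang", localedir, languages=["cn"])
print(found)

# A language without a lang.mo is simply not found.
print(gettext.find("lang", localedir, languages=["de"]))
```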

> Edit the lang.po created earlier

In [9]:
%%writefile lang.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

msgid "Hello world!"
msgstr "世界你好!"

msgid "Python is a good language."
msgstr "Python是门好语言."

Overwriting lang.po


> Generate the mo file

In [10]:
!msgfmt.py lang.po

Then put the generated mo file under `./locale/cn/LC_MESSAGES/`:

In [11]:
!cp lang.mo locale/cn/LC_MESSAGES/lang.mo

In [12]:
!rm lang.mo

In [13]:
!rm lang.po
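
To make the `mo` file less opaque, here is a rough sketch of the kind of binary catalog that a tool like `msgfmt.py` produces. It is a simplified illustration and not the script's actual code: `build_mo` is a hypothetical helper, it assumes UTF-8 throughout, and it skips `.po` parsing by taking a dict directly. The result is loaded back with `gettext.GNUTranslations` to check that it is readable.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Serialize a {msgid: msgstr} dict into GNU .mo bytes (UTF-8 only)."""
    keys = sorted(messages)  # msgids are stored in sorted order
    ids = b""
    strs = b""
    spans = []
    for key in keys:
        msgid = key.encode("utf-8")
        msgstr = messages[key].encode("utf-8")
        spans.append((len(msgid), len(ids), len(msgstr), len(strs)))
        ids += msgid + b"\0"    # every string is NUL-terminated
        strs += msgstr + b"\0"

    n = len(keys)
    header_size = 7 * 4               # seven 32-bit header fields
    ids_start = header_size + 16 * n  # two tables of n (length, offset) pairs
    strs_start = ids_start + len(ids)

    id_table = []
    str_table = []
    for id_len, id_off, str_len, str_off in spans:
        id_table += [id_len, ids_start + id_off]
        str_table += [str_len, strs_start + str_off]

    # magic, version, count, msgid table offset, msgstr table offset,
    # hash table size and offset (unused here)
    out = struct.pack("<7I", 0x950412DE, 0, n,
                      header_size, header_size + 8 * n, 0, 0)
    out += struct.pack("<%dI" % (4 * n), *(id_table + str_table))
    return out + ids + strs


catalog = {
    # The empty msgid carries the header; charset tells gettext how to decode.
    "": "Content-Type: text/plain; charset=utf-8\n",
    "Hello world!": "世界你好!",
}
translations = gettext.GNUTranslations(io.BytesIO(build_mo(catalog)))
print(translations.gettext("Hello world!"))
```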

Now repeat the same steps for the other two languages:

In [14]:
!pygettext.py

In [15]:
%%writefile messages.pot
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=IBM037\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

Overwriting messages.pot


In [16]:
!rename.py messages.pot lang.po

done!


In [17]:
%%writefile lang.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=IBM037\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"



Overwriting lang.po


In [18]:
!msgfmt.py lang.po

In [19]:
!cp lang.mo locale/en/LC_MESSAGES/lang.mo

In [20]:
!rm lang.mo

In [21]:
!rm lang.po

In [22]:
!pygettext.py

In [23]:
%%writefile messages.pot
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=IBM01147\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

Overwriting messages.pot


In [24]:
!rename.py messages.pot lang.po

done!


In [25]:
%%writefile lang.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

msgid "Hello world!"
msgstr "Bonjour le Monde!"

msgid "Python is a good language."
msgstr "Python est un bon langage."

Overwriting lang.po


In [26]:
!msgfmt.py lang.po

In [27]:
!cp lang.mo locale/fr/LC_MESSAGES/lang.mo

In [28]:
!rm lang.mo

In [29]:
!rm lang.po

> Edit the main module

In [30]:
%%writefile gettext_te.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
from transfer import *
langcn.install()
print(_("Hello world!"))
langen.install()
print(_("Hello world!"))
langfr.install()
print(_("Hello world!"))

Overwriting gettext_te.py


In [31]:
%run gettext_te.py

世界你好!
Hello world!
Bonjour le Monde!


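From now on, localizing the app only requires updating the `mo` file in the corresponding folder: set it up once and benefit from it for good.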
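
`install()` places `_` into builtins for the whole process, and `gettext.translation` raises an error when a language has no `mo` file. As a hedged alternative, the sketch below returns a local translation function and passes `fallback=True` so a missing catalog degrades to the original text. `get_translator` is a hypothetical helper, and `"xx"` is a made-up language code with no catalog.

```python
import gettext


def get_translator(language, localedir="./locale", domain="lang"):
    """Return a gettext function for language, or a pass-through if missing."""
    # fallback=True returns a NullTranslations object instead of raising
    # when no matching .mo file exists.
    catalog = gettext.translation(domain, localedir, languages=[language],
                                  fallback=True)
    return catalog.gettext


_ = get_translator("xx")  # "xx" is a made-up code with no catalog
print(_("Hello world!"))
```

Binding `_` locally keeps each module in control of its own language, at the cost of one extra line per module.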

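## Handling strings with variables using the format method

When a string contains variables, we could translate it piece by piece, but that is clearly awkward to use and ugly to read. The string `format` method lets us translate such strings gracefully.

(Please restart the kernel first.)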

In [1]:
%%writefile lang.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

msgid "Hello world!"
msgstr "Bonjour le Monde!"

msgid "Python is a good language."
msgstr "Python est un bon langage."

msgid "Hello"
msgstr "Bonjour"

msgid "Hello {name:}!"
msgstr "Bonjour {name:}!"

Writing lang.po


In [2]:
!msgfmt.py lang.po

In [3]:
!cp lang.mo locale/fr/LC_MESSAGES/lang.mo

In [4]:
!rm lang.po

In [5]:
!rm lang.mo

In [6]:
%%writefile lang.po
# SOME DESCRIPTIVE TITLE.
# Copyright (C) YEAR ORGANIZATION
# FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
#
msgid ""
msgstr ""
"Project-Id-Version: PACKAGE VERSION\n"
"POT-Creation-Date: 2016-01-06 10:05+CST\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language-Team: LANGUAGE <LL@li.org>\n"
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: utf-8\n"
"Generated-By: pygettext.py 1.5\n"

msgid "Hello world!"
msgstr "世界你好!"

msgid "Python is a good language."
msgstr "Python是门好语言."

msgid "Hello"
msgstr "你好"

msgid "Hello {name:}!"
msgstr "你好{name:}!"

Writing lang.po


In [7]:
!msgfmt.py lang.po

In [8]:
!cp lang.mo locale/cn/LC_MESSAGES/lang.mo

In [9]:
!rm lang.po

In [10]:
!rm lang.mo

In [11]:
%%writefile gettext_te2.py
#!/usr/bin/env python
# -*- coding: utf-8 -*-
from __future__ import print_function
from transfer import *
langcn.install()
print(_("Hello world!"))
print(_("Hello"))
print(_("Hello {name:}!").format(name="Lily"))
langen.install()
print(_("Hello world!"))
print(_("Hello"))
print(_("Hello {name:}!").format(name="Lily"))
langfr.install()
print(_("Hello world!"))
print(_("Hello"))
print(_("Hello {name:}!").format(name="Lily"))

Overwriting gettext_te2.py


In [12]:
%run gettext_te2.py

世界你好!
你好
你好Lily!
Hello world!
Hello
Hello Lily!
Bonjour le Monde!
Bonjour
Bonjour Lily!
