Makes localizing or translating your web app as simple as print _('Hello World')
Brian McConnell email@example.com
A cloud-based translation and localization utility for Python that combines human and machine translation.
This is a simple but very useful utility that automates the process of localizing a web application or translating dynamic content. It is as simple to use as gettext, but there are no translation files or prompt databases to worry about.
See a helpful how-to article that explains how to mimic this functionality in your development environment or framework of choice: https://github.com/myGengo/avalon/blob/master/howto.md
Note: global variables are defined in config.py to facilitate cross-module usage:
sl = 'en'  # source language
tl = 'es'  # target language
gengo_public_key = 'foo'
gengo_private_key = 'bar'
google_apikey = 'foobar'
translation_order = ['gengo', 'google']  # translation services, in the order they are called
With those settings in place, wrap any text you output in _():
print _('Hello World!')
print _('Hello World!', tm='gengo', tier='pro')
print _('Hello World!', tm='google')
- Runs on Google App Engine (Python) out of the box
- Caches translations for any number of strings into any number of languages (uses App Engine memcache)
- Requests translations for new strings from popular machine translation engines
- Requests translations from human translators at http://www.gengo.com (starting at 5 cents per word)
- No local database, file system or configuration hassles; translations for all of your apps and content stay in sync as they are updated
- Just say print _('Hello World!', ....) and go
This utility treats translation resources (translation memory, machine translation and human translation APIs) as services that can be queried whenever new texts are encountered. The utility can be somewhat slow when rendering a page with many new source texts, so there are a number of things you can do to improve performance, including:
- Always use caching (e.g. memcached on App Engine)
- If possible, implement a persistent data store within your system (e.g. App Engine data store, MySQL, etc.) to further reduce the need for API calls
- If possible, when calling human translation services, use an asynchronous callback to receive completed translations, rather than polling
- Consider grouping translations in collections, for example to retrieve all translations for a specific resource or URL to a specific target language in a single request
- Spider frequently loaded pages to refresh the cache, both to speed performance, and to pick up new translations or post-edits
- When translating long-form content, segment it at the page or paragraph/div level (don't segment by word or by sentence)
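As one example of the tips above, paragraph-level segmentation combined with caching might look like the sketch below. The names request_translation, cache_key and translate_document are hypothetical, the dictionary stands in for memcache or a persistent store, and the stub "translation" simply tags the text with the target language. Hashing each paragraph keeps cache keys short regardless of paragraph length.

```python
import hashlib

_cache = {}     # hypothetical stand-in for memcache or a persistent store
api_calls = []  # records which segments were sent out, for illustration


def request_translation(segment, tl):
    """Hypothetical stub for a call to a translation API."""
    api_calls.append(segment)
    return '[%s] %s' % (tl, segment)


def cache_key(segment, tl):
    """Build a short, fixed-length key from the target language and text."""
    digest = hashlib.sha1(segment.encode('utf-8')).hexdigest()
    return '%s:%s' % (tl, digest)


def translate_document(text, tl):
    """Translate text paragraph by paragraph, reusing cached segments."""
    translated = []
    for paragraph in text.split('\n\n'):    # segment at the paragraph level
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        key = cache_key(paragraph, tl)
        if key not in _cache:               # only new paragraphs cost a call
            _cache[key] = request_translation(paragraph, tl)
        translated.append(_cache[key])
    return '\n\n'.join(translated)


draft = 'First paragraph.\n\nSecond paragraph.'
translate_document(draft, 'es')             # two API calls
revised = 'First paragraph.\n\nSecond paragraph, now edited.'
translate_document(revised, 'es')           # one more call, for the edited paragraph
print(len(api_calls))  # prints: 3
```

Because each paragraph is cached on its own, editing one paragraph of a long article only triggers a new request for that paragraph, while translators still see enough context to produce a sensible result.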
That said, this tool is very easy to use, and reduces or eliminates the need to deal with static translation files (e.g. PO files), which are a hassle to manage and quickly fall out of sync with the source material. If you're interested in extending the utility, just start your own fork and let us know.
Extending This Utility
This module was initially developed as a proof of concept on App Engine. It would be great if developers extended it to support other environments (AWS, Django, etc.) and added connectors to additional translation memories (such as TAUS).
Developers should observe the following guidelines when adding to this utility:
- Email firstname.lastname@example.org to give us a heads up, or to ask questions about the utility
- Check the local cache or memcached before calling network resources (disable the utility if no cache is available)
- Allow the user to adjust the order in which translation services are called
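A connector layer that follows these guidelines could be structured along the lines below. This is a sketch under assumptions, not an existing interface in the utility: the Translator class, its register method and the memory_connector stub are all hypothetical names.

```python
class Translator(object):
    """Hypothetical wrapper illustrating the guidelines; not part of the utility."""

    def __init__(self, cache, order):
        self.cache = cache          # dict-like cache, or None if unavailable
        self.order = list(order)    # caller-supplied service order
        self.connectors = {}        # name -> callable(text, sl, tl)

    def register(self, name, connector):
        """Add a connector for a translation service or translation memory."""
        self.connectors[name] = connector

    def translate(self, text, sl, tl):
        if self.cache is None:      # no cache: the utility is disabled
            return text
        key = (sl, tl, text)
        if key in self.cache:       # cache first, before any network call
            return self.cache[key]
        for name in self.order:     # then each registered service, in order
            connector = self.connectors.get(name)
            result = connector(text, sl, tl) if connector else None
            if result:
                self.cache[key] = result
                return result
        return text


calls = []  # records connector invocations, for illustration


def memory_connector(text, sl, tl):
    """Stub translation memory lookup that records each call."""
    calls.append(text)
    return {'Hello': 'Hola'}.get(text)


translator = Translator(cache={}, order=['memory'])
translator.register('memory', memory_connector)
print(translator.translate('Hello', 'en', 'es'))  # prints: Hola
print(translator.translate('Hello', 'en', 'es'))  # prints: Hola (from the cache)
print(len(calls))                                 # prints: 1

disabled = Translator(cache=None, order=['memory'])
disabled.register('memory', memory_connector)
print(disabled.translate('Hello', 'en', 'es'))    # prints: Hello
```

Returning the source text when no cache is available keeps an uncached deployment from issuing an API call on every page render.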
Building A Version Of This For Your Development Environment
I recently wrote a how-to article for O'Reilly Media as a companion to this library. The article describes the basic design pattern and how to mimic this approach in your preferred language or framework. The idea is simple enough, and when done right, swats a bunch of flies in one go. If you decide to implement this in Java, Ruby, Erlang or whatever, be sure to let us know.
Questions or suggestions? Email email@example.com