In [1]:
from back_translators import *

In [2]:
# Use the proper language code as dst_lang; codes may differ across translators.
# If you are unsure, simply type the full language name: hints or corrections will be
# given if it is wrong. For common languages, feel free to use the full name!

# Baidu back translator
# the ids and keys are deleted for privacy
appid = '---------'
secretKey = '---------'
BBT = BaiduBackTranslator(appid, secretKey, 'en') # 'en' is the right code for english

# Google back translator
GBT = GoogleBackTranslator('English') # but 'English' also works!

# Papago back translator
clid = '---------'
clkey = '---------'
PBT = PapagoBackTranslator(clid, clkey, 'eNgLiSH') # even 'eNgLiSH' works!

queries = ['I am very happy today!', 
        'Back translation can work well as a means of text augmentation', 
        'This is the third example, \n and it has two lines']

Returning lang code [en] for "English" (english)
Returning lang code [en] for "eNgLiSH" (English)
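
The log lines above show a full language name being resolved to a code regardless of letter case. As a rough picture of that behaviour, here is a minimal sketch. It is not the library's implementation: `LANG_CODES` and `resolve_lang_code` are hypothetical names, and the table holds only a few entries for illustration.

```python
# Hypothetical illustration only; not the back_translators implementation.
LANG_CODES = {
    "english": "en",
    "french": "fr",
    "japanese": "ja",
    "spanish": "es",
}


def resolve_lang_code(lang: str) -> str:
    """Return a language code given either a code or a full language name."""
    key = lang.strip().lower()
    if key in LANG_CODES.values():  # already a known code such as 'en'
        return key
    if key in LANG_CODES:  # a full name such as 'English' or 'eNgLiSH'
        return LANG_CODES[key]
    raise ValueError(f"Unknown language: {lang!r}")


for name in ("en", "English", "eNgLiSH"):
    print(name, "->", resolve_lang_code(name))
```

Lower-casing before the lookup is what lets 'en', 'English' and 'eNgLiSH' all land on the same code in this sketch.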


## Translation

In [3]:
# translate one query into Chinese 
print('Baidu Translate: ', BBT.translate(queries[0], 'en', 'zh')) # you can also try passing "Chinese" here
print('Google Translate: ', GBT.translate(queries[0], 'en', 'zh-cn'))
print('Papago Translate: ', PBT.translate(queries[0], 'en', 'zh-CN'))

Baidu Translate:  我今天很开心！
Google Translate:  今天我很高兴！
Papago Translate:  我今天很幸福！


In [4]:
# translate a list of queries into Chinese. Use 'bulk_transalte'!
print('Baidu Translate: \n')
results = BBT.bulk_transalte(queries, 'en', 'zh')
for i, res in enumerate(results):
    print(f'This is translated example {i+1}: ', res)


print('\nGoogle Translate: \n')
results = GBT.bulk_transalte(queries, 'en', 'zh-cn')
for i, res in enumerate(results):
    print(f'This is translated example {i+1}: ', res)
    

print('\nPapago Translate: \n')
results = PBT.bulk_transalte(queries, 'en', 'zh-CN')
for i, res in enumerate(results):
    print(f'This is translated example {i+1}: ', res)

Baidu Translate: 

This is translated example 1:  我今天很开心！
This is translated example 2:  回译可以很好地作为文本扩充的一种手段
This is translated example 3:  这是第三个例子,，
它有两条线

Google Translate: 

This is translated example 1:  今天我很高兴！
This is translated example 2:  后退转换可以用作文本增强的手段
This is translated example 3:  这是第三个例子，
 它有两条线

Papago Translate: 

This is translated example 1:  我今天很幸福！
This is translated example 2:  作为文本增强手段，可以很好地进行后翻译。
This is translated example 3:  这是第三个例子。 
 有两条线。


## Back Translation

### A simple example

In [5]:
# src_lang and dst_lang are not necessary if they are the same as the dst_lang given at initialization
# mid_lang can be one language or a list of languages. If not given, it is chosen at random.

print('Baidu: ', BBT.back_translate(queries[1], src_lang='en', mid_lang='kor', dst_lang='en'))
print('Google: ', GBT.back_translate(queries[1], mid_lang='arabic'))
print('Papago: ', PBT.back_translate(queries[1]))

Baidu:  Translation can be used as a means to expand the text
Returning lang code [ar] for "arabic" (arabic)
Google:  The background translation can work well as a text zoom method
Papago:  Reverse translation is an effective way to enlarge text.
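
To make the round trip concrete, here is a minimal sketch of back translation as a chain of translation calls: source, then one or more intermediate languages, then the destination. It is an illustration under stated assumptions, not the library's code. `back_translate_chain` and `fake_translate` are hypothetical names, and the stand-in translator only tags the text with the language pair so the sketch runs without any API keys. The dictionary keys mirror the ones printed with `out_dict=True` elsewhere in this notebook.

```python
from typing import Callable, Dict, List

# A translator is modelled as any function (text, src_lang, dst_lang) -> text.
TranslateFn = Callable[[str, str, str], str]


def back_translate_chain(
    text: str,
    src_lang: str,
    mid_langs: List[str],
    dst_lang: str,
    translate: TranslateFn,
) -> Dict[str, str]:
    """Translate through every language in mid_langs, then into dst_lang."""
    record = {"srcLang": src_lang, "originText": text}
    current_text, current_lang = text, src_lang
    for step, mid in enumerate(mid_langs, start=1):
        current_text = translate(current_text, current_lang, mid)
        current_lang = mid
        record[f"transLang{step}"] = mid
        record[f"transText{step}"] = current_text
    record["dstLang"] = dst_lang
    record["finalText"] = translate(current_text, current_lang, dst_lang)
    return record


def fake_translate(text: str, src: str, dst: str) -> str:
    """Stand-in translator (an assumption): tags the text instead of translating."""
    return f"[{src}->{dst}] {text}"


result = back_translate_chain("I am very happy today!", "en", ["fr", "ja"], "en", fake_translate)
for key, value in result.items():
    print(f"{key}: {value}")
```

Passing the translator in as a function keeps the chain logic independent of any one service; a real translator's `translate` method with the same argument order could be supplied in its place.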


### To track the back translation process

In [6]:
import pprint
pp = pprint.PrettyPrinter(sort_dicts=False)


# this time, let's leave mid_lang random or pass a list
# if all_mid_lang is True, back translation goes through every language in mid_lang
print('Baidu:')
pp.pprint(BBT.back_translate(queries[0], out_dict=True))

print('\nGoogle:')
pp.pprint(GBT.back_translate(queries[0], mid_lang=['french', 'japanese', 'Spanish', 'Vietnamese', 
                                                'uyghur', 'turkish', 'chinese (traditional)'], 
                                     all_mid_lang=True, out_dict=True))

print('\nPapago:')
results = PBT.back_translate(queries[0], mid_lang=['thai', 'Indonesia'], out_dict=True)
for idx, res in enumerate(results):
    print(f'Back translation {idx}: ')
    pp.pprint(res)

Baidu:
{'srcLang': 'english',
 'originText': 'I am very happy today!',
 'transLang1': 'danish',
 'transText1': 'Jeg er meget glad i dag!',
 'dstLang': 'english',
 'finalText': "I'm very happy today!"}

Google:
Returning lang code [fr] for "french" (french)
Returning lang code [ja] for "japanese" (japanese)
Returning lang code [es] for "Spanish" (spanish)
Returning lang code [vi] for "Vietnamese" (vietnamese)
Returning lang code [ug] for "uyghur" (uyghur)
Returning lang code [tr] for "turkish" (turkish)
Returning lang code [zh-tw] for "chinese (traditional)" (chinese (traditional))
{'srcLang': 'english',
 'originText': 'I am very happy today!',
 'transLang1': 'french',
 'transText1': "Je suis très content aujourd'hui!",
 'transLang2': 'japanese',
 'transText2': '私は今日とても幸せです！',
 'transLang3': 'spanish',
 'transText3': '¡Estoy muy feliz hoy!',
 'transLang4': 'vietnamese',
 'transText4': 'Hôm nay tôi rất vui!',
 'transLa

## Back translate a list of queries

In [7]:
# The usage is the same as above; this is just a simple example
print('Baidu:')
results = BBT.bulk_back_translate(queries, mid_lang='cantonese')
for idx, res in enumerate(results):
    print(f'Back translated text {idx+1}: ', res)

print('\nGoogle:')
results = GBT.bulk_back_translate(queries, mid_lang=['bulgarian', 'armenian'])
for idx, res in enumerate(results):
    print(f'Google translated text {idx+1}: ', res)

    
print('\nPapago:')
results = PBT.bulk_back_translate(queries)
for idx, res in enumerate(results):
    print(f'Papago translated text {idx+1}: ', res)

Baidu:
Returning lang code [yue] for "cantonese" (cantonese)
Returning lang code [yue] for "cantonese" (cantonese)
Returning lang code [yue] for "cantonese" (cantonese)
Back translated text 1:  I'm very happy today!
Back translated text 2:  He used a method that can be well used as a means of text expansion
Back translated text 3:  This is the third example,,
He has two lines

Google:
Returning lang code [bg] for "bulgarian" (bulgarian)
Returning lang code [hy] for "armenian" (armenian)
Returning lang code [bg] for "bulgarian" (bulgarian)
Returning lang code [hy] for "armenian" (armenian)
Returning lang code [bg] for "bulgarian" (bulgarian)
Returning lang code [hy] for "armenian" (armenian)
Google translated text 1:  ['I am very happy today!', 'I am very happy today.']
Google translated text 2:  ['Reverse translation can work well as a means of increasing the text', 'Returning translation can work

## Back translation as text augmentation

In [8]:
# you can set mid_lang as a str or a list of str, or leave it unset
# set `out_per_text` (int, default=1) to define the max number of augmented texts per input
# if out_per_text > 1, mid_lang is randomly reset from the second iteration onward.
# if mid_lang is a list, a random list of the same size is chosen.

# you can also pass a single str as the query; there is no separate bulk_augment

print('Baidu:\n')
results = BBT.augment(queries,  out_per_text=1)
for idx, res in enumerate(results):
    print(f'Baidu augmented text {idx+1}: ', res)


print('\nGoogle:\n')
results = GBT.augment(queries, mid_lang=['zh-cn', 'fr'], out_per_text=2)
for idx, res in enumerate(results):
    print(f'Google augmented text {idx+1}: ', res)
    
    
print('\nPapago:\n')
results = PBT.augment(queries, mid_lang='ko', out_per_text=3)
for idx, res in enumerate(results):
    print(f'Papago augmented text {idx+1}: ', res)

Baidu:

Baidu augmented text 1:  ['I am very happy today.']
Baidu augmented text 2:  ['Translation is a good way to expand text']
Baidu augmented text 3:  ['This is the third example\nThere are two lines']

Google:

Google augmented text 1:  ["I'm so happy today."]
Google augmented text 2:  ['DOS conversion can be used as a means of improving the text', 'The new translator can make the same as an increase in the increase in the fact that']
Google augmented text 3:  ['This is a second example,\n and have two lines', 'This is the third example.\n He has two lines']

Papago:

Papago augmented text 1:  ['I had a great time today!', "I'm very happy today!"]
Papago augmented text 2:  ['As a means of text enhancement, post-translation is possible.', 'Reverse translation can work well as a means of strengthening text.', 'White translation can work well as a means of expanding text.']
Papago augmented text 3:  ["Here's the third example. \n And there are two lines.", 'This is the third example,
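
Back translation sometimes returns the original sentence unchanged or produces the same paraphrase twice. As an optional post-processing step, here is a minimal sketch that drops such candidates before they are used as augmented data. `clean_augmented` is a hypothetical helper, not part of the library, and the case-insensitive comparison is an assumption chosen for illustration.

```python
from typing import List


def clean_augmented(original: str, candidates: List[str]) -> List[str]:
    """Drop candidates equal to the original or to an earlier candidate.

    Comparison ignores case and surrounding whitespace; order is preserved.
    """
    seen = {original.strip().lower()}
    kept = []
    for cand in candidates:
        key = cand.strip().lower()
        if key not in seen:
            seen.add(key)
            kept.append(cand)
    return kept


original = "I am very happy today!"
candidates = ["I am very happy today!", "I'm very happy today!", "I'm very happy today!"]
print(clean_augmented(original, candidates))
```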

## Problem

Not all languages in the lang_dict provided for a translator are mutually translatable through the API. Make sure that you either keep only mutually translatable languages in the lang_dict, or pass only mutually translatable languages as mid_lang, when you use back translation to augment texts.

In [9]:
PBT.translate('私は今日とても幸せです！', 'ja', 'es')

Cannot translate "私は今日とても幸せです！" from ja to es
Reason being:  'message'


## Notes

- Google Translate appears to be the most stable in terms of mutual translatability among its supported languages. 
- Baidu Translate is much more affordable, but you need a Chinese domestic phone number in order to apply for access to its translation API. 


### Disclaimer 

- The methods provided here for accessing Google Translate without applying for access to its API are for illustration purposes. If you plan to use Google Translate for large-scale back translation, please apply for its API, as that is the most reliable and ethical way to use its cloud translation service. 
    - You can apply for Google Translate API at: https://cloud.google.com/translate/. 
    - Quickstarts: https://cloud.google.com/translate/docs/quickstarts. 
    - Tutorial: https://codelabs.developers.google.com/codelabs/cloud-translation-python3#0. 