Skip to content

toyama0919/embulk-filter-google_translate_api

master
Switch branches/tags

Name already in use

A tag already exists with the provided branch name. Many Git commands accept both tag and branch names, so creating this branch may cause unexpected behavior. Are you sure you want to create this branch?
Code

Latest commit

 

Git stats

Files

Permalink
Failed to load latest commit information.
Type
Name
Latest commit message
Commit time
src
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Google Translate Api filter plugin for Embulk

Google Translate Api filter plugin for Embulk.

see support language. Google Language Codes - tomihasa

Overview

  • Plugin type: filter

Configuration

  • key_names: target key names (array, required)
  • out_key_name_suffix: translated target key names suffix (string, required)
  • source_lang: source language (string, default: null)
  • target_lang: target language (string, required)
  • model: nmt(neural machine translation) or base. if not define use nmt. (string, default: null)
  • sleep: delay per record, define milliseconds. (integer, default: 0)
  • google_api_key: google_api_key. support environment variable. please export GOOGLE_API_KEY(string, default: null)

Example

input

- {
    sentence1: 'Embulk supports plugins to add functions',
    sentence2: 'Embulk is a parallel bulk data loader that helps data transfer between various storages, databases, NoSQL and cloud services.',
    sentence3: 'You can share the plugins to keep your custom scripts readable, maintainable, and reusable.',
    json_column: ['aaa', 'bbbb', 'cccc']
  }
- {
    sentence1: 'Automatic guessing of input file formats',
    sentence2: 'Parallel & distributed execution to deal with big data sets',
    json_column: ['aaa', 'bbbb', 'cccc']
  }

setting

filters:
  - type: google_translate_api
    key_names:
     - sentence1
     - sentence2
     - sentence3
    out_key_name_suffix: _translated
    source_lang: en
    target_lang: ja
    sleep: 1000
    google_api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

output

*************************** 1 ***************************
           sentence1 (string) : Embulk supports plugins to add functions
           sentence2 (string) : Embulk is a parallel bulk data loader that helps data transfer between various storages, databases, NoSQL and cloud services.
           sentence3 (string) : You can share the plugins to keep your custom scripts readable, maintainable, and reusable.
         json_column (  json) : ["aaa","bbbb","cccc"]
sentence1_translated (string) : Embulkは、機能を追加するためのプラグインをサポートしています
sentence2_translated (string) : Embulkは、さまざまなストレージ、データベース、NoSQLのとクラウドサービス間のデータ転送を助けるパラレル・バルク・データ・ローダーです。
sentence3_translated (string) : あなたは、読み込み可能な保守性、および再利用可能なカスタムスクリプトを維持するためのプラグインを共有することができます。
*************************** 2 ***************************
           sentence1 (string) : Automatic guessing of input file formats
           sentence2 (string) : Parallel & distributed execution to deal with big data sets
           sentence3 (string) :
         json_column (  json) : ["aaa","bbbb","cccc"]
sentence1_translated (string) : 入力ファイル形式の自動推測
sentence2_translated (string) : ビッグデータ・セットに対処するための並列分散実行
sentence3_translated (string) :
embulk preview -G -b embulk_bundle -I  tmp/test_translate.yml.liquid  10.86s user 0.68s system 115% cpu 9.991 total

Example(Multi language combined)

input

- {
    sentence1: 'Embulk is a Java application.',
    sentence2: 'Embulk ist eine Java-Anwendung.',
    sentence3: 'Embulk是Java应用程序。',
    json_column: ['aaa', 'bbbb', 'cccc']
  }

setting

filters:
  - type: google_translate_api
    key_names:
     - sentence1
     - sentence2
     - sentence3
    out_key_name_suffix: _translated
    target_lang: ja
    sleep: 1000
    google_api_key: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
  • If not define source_lang, auto detect language.

output

*************************** 1 ***************************
           sentence1 (string) : Embulk is a Java application.
           sentence2 (string) : Embulk ist eine Java-Anwendung.
           sentence3 (string) : Embulk是Java应用程序。
         json_column (  json) : ["aaa","bbbb","cccc"]
sentence1_translated (string) : Embulkは、Javaアプリケーションです。
sentence2_translated (string) : Embulkは、Javaアプリケーションです。
sentence3_translated (string) : Embulkは、Javaアプリケーションです。

Build

$ ./gradlew gem  # -t to watch change of files and rebuild continuously

About

Google Translate Api filter plugin for Embulk.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published