Transipy

Transipy: The Powerful and Fast Document Translation Tool

Transipy is your one-stop solution for lightning-fast and accurate document translation. With its parallel processing capabilities, Transipy effortlessly handles large volumes of data in various formats, including CSV, TXT, DOCX, and XLSX.
Explore the docs »

View Demo · Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Bugs
  5. Contributing
  6. License
  7. Contact

About The Project

Transipy is your one-stop solution for lightning-fast and accurate document translation. With its parallel processing capabilities, Transipy effortlessly handles large volumes of data in various formats, including CSV, TXT, DOCX, and XLSX.

Key Features

  • Fast: Transipy uses parallel processing to translate large documents quickly.
  • Versatile Format Support: Translate documents in CSV, TXT, DOCX, and XLSX formats directly, with no manual conversion.
  • High Accuracy: Transipy's translation engine aims to deliver precise results so that your message is conveyed accurately across languages.

Transform your document translation workflow with Transipy – the powerful, fast, and versatile solution you've been waiting for.

Getting Started

To get a local copy up and running, follow these steps.

Installation

Install Transipy from PyPI:

pip install transipy

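Try a sample translation (examples/sample.csv is a path inside the repository, so run this from a clone of the repository or substitute your own file):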

transipy -f examples/sample.csv -s en -t vi

You can also install from the git repository:

git clone git@github.com:NeiH4207/transipy.git
cd transipy
pip install -e .

Language and ISO-639 code

| Language              | ISO-639 Code | Language    | ISO-639 Code | Language             | ISO-639 Code |
| --------------------- | ------------ | ----------- | ------------ | -------------------- | ------------ |
| Afrikaans             | af           | Albanian    | sq           | Amharic              | am           |
| Arabic                | ar           | Armenian    | hy           | Azerbaijani          | az           |
| Basque                | eu           | Belarusian  | be           | Bengali              | bn           |
| Bosnian               | bs           | Bulgarian   | bg           | Catalan              | ca           |
| Cebuano               | ceb          | Chichewa    | ny           | Chinese (Simplified) | zh-CN        |
| Chinese (Traditional) | zh-TW        | Corsican    | co           | Croatian             | hr           |
| Czech                 | cs           | Danish      | da           | Dutch                | nl           |
| English               | en           | Esperanto   | eo           | Estonian             | et           |
| Filipino              | fil          | Finnish     | fi           | French               | fr           |
| Frisian               | fy           | Galician    | gl           | Georgian             | ka           |
| German                | de           | Greek       | el           | Gujarati             | gu           |
| Haitian Creole        | ht           | Hausa       | ha           | Hawaiian             | haw          |
| Hebrew                | he           | Hindi       | hi           | Hmong                | hmn          |
| Hungarian             | hu           | Icelandic   | is           | Igbo                 | ig           |
| Indonesian            | id           | Irish       | ga           | Italian              | it           |
| Japanese              | ja           | Javanese    | jv           | Kannada              | kn           |
| Kazakh                | kk           | Khmer       | km           | Korean               | ko           |
| Kurdish (Kurmanji)    | ku           | Kyrgyz      | ky           | Lao                  | lo           |
| Latin                 | la           | Latvian     | lv           | Lithuanian           | lt           |
| Luxembourgish         | lb           | Macedonian  | mk           | Malagasy             | mg           |
| Malay                 | ms           | Malayalam   | ml           | Maltese              | mt           |
| Maori                 | mi           | Marathi     | mr           | Mongolian            | mn           |
| Myanmar (Burmese)     | my           | Nepali      | ne           | Norwegian            | no           |
| Odia (Oriya)          | or           | Pashto      | ps           | Persian              | fa           |
| Polish                | pl           | Portuguese  | pt           | Punjabi              | pa           |
| Romanian              | ro           | Russian     | ru           | Samoan               | sm           |
| Scots Gaelic          | gd           | Serbian     | sr           | Sesotho              | st           |
| Shona                 | sn           | Sindhi      | sd           | Sinhala              | si           |
| Slovak                | sk           | Slovenian   | sl           | Somali               | so           |
| Spanish               | es           | Sundanese   | su           | Swahili              | sw           |
| Swedish               | sv           | Tajik       | tg           | Tamil                | ta           |
| Tatar                 | tt           | Telugu      | te           | Thai                 | th           |
| Turkish               | tr           | Turkmen     | tk           | Ukrainian            | uk           |
| Urdu                  | ur           | Uyghur      | ug           | Uzbek                | uz           |
| Vietnamese            | vi           | Welsh       | cy           | Xhosa                | xh           |
| Yiddish               | yi           | Yoruba      | yo           | Zulu                 | zu           |

Usage

usage: transipy [-h] -f FILE_PATH [-l SEP] -s SOURCE -t TARGET [-c CHUNK_SIZE] [-o OUTPUT_FILE] [-d DICTIONARY] [--column COLUMN]
                [--skip SKIP] [--sheet SHEET]

Translate text in a file (.csv/.txt/.docx/.xlsx) from source language to target language.

options:
  -h, --help            show this help message and exit
  -f FILE_PATH, --file-path FILE_PATH
                        The source file path
  -l SEP, --sep SEP     The separator of the file [comma, tab, space,...]
  -s SOURCE, --source SOURCE
                        Source language (e.g. en, vi)
  -t TARGET, --target TARGET
                        Target language (e.g. en, vi)
  -c CHUNK_SIZE, --chunk-size CHUNK_SIZE
                        The chunk size for splitting the translation process
  -o OUTPUT_FILE, --output-file OUTPUT_FILE
                        The output file path
  -d DICTIONARY, --dictionary DICTIONARY
                        The dictionary file path, used for custom translation
  --column COLUMN       The column name to translate, separated by comma
  --skip SKIP           The column name to skip, separated by comma
  --sheet SHEET         The sheet name to translate, separated by comma
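
If you drive Transipy from a script, it can help to assemble the command line programmatically. The following is a minimal sketch that only builds an argument list from the flags shown in the help text above; the function name build_transipy_command and its parameter names are hypothetical and not part of Transipy.

```python
import shlex
from typing import List, Optional, Sequence


def build_transipy_command(
    file_path: str,
    source: str,
    target: str,
    columns: Optional[Sequence[str]] = None,
    skip: Optional[Sequence[str]] = None,
    dictionary: Optional[str] = None,
    output_file: Optional[str] = None,
    chunk_size: Optional[int] = None,
) -> List[str]:
    """Build an argv list for the transipy CLI (hypothetical helper).

    Only flags listed in the help text are used; optional flags are
    added only when a value is supplied.
    """
    cmd = ["transipy", "-f", file_path, "-s", source, "-t", target]
    if chunk_size is not None:
        cmd += ["-c", str(chunk_size)]
    if output_file:
        cmd += ["-o", output_file]
    if dictionary:
        cmd += ["-d", dictionary]
    if columns:
        # --column takes column names separated by commas
        cmd += ["--column", ",".join(columns)]
    if skip:
        # --skip takes column names separated by commas
        cmd += ["--skip", ",".join(skip)]
    return cmd


if __name__ == "__main__":
    cmd = build_transipy_command(
        "path_to_file.csv", "en", "vi", columns=["Title", "Summary"]
    )
    # Print a shell-quoted form of the command for inspection
    print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run once Transipy is installed and on your PATH.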

Translate a file

Example, where the file extension is one of csv, tsv, txt, xlsx, or docx:

transipy -f path_to_file.[csv, tsv, txt, xlsx, docx] -s <source> -t <target>

Translate a file with a dictionary

The dictionary file is a JSON file that maps source words to the translations you want used for them. It should be in the following format (see examples/dictionary.json):

{
    "word_1": "translated_word_1",
    "word_2": "translated_word_2"
}
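
A dictionary file in this format can be generated from a script rather than written by hand. The sketch below assumes only that the file is a flat JSON object of string keys and string values, as shown above; the helper name write_dictionary is hypothetical, and the README does not say whether multi-word phrases are matched, so treat phrase entries as an assumption.

```python
import json
from pathlib import Path
from typing import Dict


def write_dictionary(entries: Dict[str, str], path: str) -> None:
    """Write a flat source -> translation mapping as UTF-8 JSON."""
    for key, value in entries.items():
        if not isinstance(key, str) or not isinstance(value, str):
            raise TypeError("dictionary keys and values must be strings")
    # ensure_ascii=False keeps non-ASCII text (e.g. Vietnamese) readable
    text = json.dumps(entries, ensure_ascii=False, indent=4)
    Path(path).write_text(text, encoding="utf-8")
```

For example, write_dictionary({"word_1": "translated_word_1"}, "dictionary.json") produces a file that can then be passed to the -d option.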

Example: You have a dictionary file named dictionary.json and you want to translate specific columns ("Title" and "Summary") from a CSV file from English to Vietnamese. You can use the following command:

transipy -f path_to_file.csv -s en -t vi -d path_to/dictionary.json --column Title,Summary

Example input file:

| Title            | Summary                   | Level 1 | Level 2        | Level 3        | Level 4 |
| ---------------- | ------------------------- | ------- | -------------- | -------------- | ------- |
| Stomach Cancer   | Likelihood of Development | lower   | slightly lower | slightly higher| higher  |
| Colorectal Cancer| Likelihood of Development | lower   | slightly lower | slightly higher| higher  |
| Thyroid Cancer   | Likelihood of Development | lower   | slightly lower | slightly higher| higher  |
| Lung Cancer      | Likelihood of Development | lower   | slightly lower | slightly higher| higher  |
| Liver Cancer     | Likelihood of Development | lower   | slightly lower | slightly higher| higher  |

Example output file:

| Title              | Summary                  | Level 1 | Level 2        | Level 3        | Level 4 |
| ------------------ | ------------------------ | ------- | -------------- | -------------- | ------- |
| Ung thư dạ dày     | Khả năng phát triển      | lower   | slightly lower | slightly higher| higher  |
| Ung thư đại trực   | Khả năng phát triển      | lower   | slightly lower | slightly higher| higher  |
| Ung thư tuyến giáp | Khả năng phát triển      | lower   | slightly lower | slightly higher| higher  |
| Ung thư phổi       | Khả năng phát triển      | lower   | slightly lower | slightly higher| higher  |
| Ung thư gan        | Khả năng phát triển      | lower   | slightly lower | slightly higher| higher  |
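
To confirm that only the requested columns were translated, the input and output files can be compared column by column. This is a minimal sketch that assumes both files are CSV, that the output keeps the same header names and row order as the input (as in the example above), and that the file paths are supplied by you; changed_columns is a hypothetical helper, not part of Transipy.

```python
import csv
from typing import Set


def changed_columns(input_path: str, output_path: str) -> Set[str]:
    """Return the names of columns whose values differ between two CSV files."""
    with open(input_path, newline="", encoding="utf-8") as f:
        in_rows = list(csv.DictReader(f))
    with open(output_path, newline="", encoding="utf-8") as f:
        out_rows = list(csv.DictReader(f))

    changed: Set[str] = set()
    # Rows are compared pairwise, assuming the row order is unchanged
    for before, after in zip(in_rows, out_rows):
        for column, value in before.items():
            if after.get(column) != value:
                changed.add(column)
    return changed
```

For the example above, the expected result would be the two translated columns, Title and Summary, while the Level columns stay out of the set.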

Bugs

  • Error: invalid syntax. Perhaps you forgot a comma? - This error is caused by a bug in the current Google Translate version. It occurs when the text contains certain words (for example, "nullified").
  • Error: HTTPSConnectionPool(host='translate.googleapis.com', port=443): Max retries exceeded with url - This error is caused by request limits in the Google Translate API. To work around it, increase the chunk size (-c CHUNK_SIZE) so that fewer requests are sent to the API in a short time.
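
The chunk-size workaround for the second error can be automated by retrying with progressively larger values. The sketch below is illustrative only: it assumes the transipy command exits with a non-zero status when a run fails (not confirmed by this README), and the chunk sizes and pause are placeholder values, since the default chunk size and its unit are not documented here. The function names are hypothetical.

```python
import subprocess
import time
from typing import Callable, Optional, Sequence


def run_transipy(cmd: Sequence[str]) -> bool:
    """Run the command; treat exit status 0 as success (an assumption)."""
    return subprocess.run(list(cmd)).returncode == 0


def translate_with_growing_chunks(
    file_path: str,
    source: str,
    target: str,
    chunk_sizes: Sequence[int] = (4, 8, 16),  # placeholder values
    pause_seconds: float = 5.0,               # placeholder value
    runner: Callable[[Sequence[str]], bool] = run_transipy,
) -> Optional[int]:
    """Retry the translation with larger chunk sizes until one run succeeds.

    Returns the chunk size that worked, or None if every attempt failed.
    """
    for attempt, size in enumerate(chunk_sizes):
        cmd = ["transipy", "-f", file_path, "-s", source, "-t", target,
               "-c", str(size)]
        if runner(cmd):
            return size
        # Wait before the next attempt, except after the final one
        if attempt < len(chunk_sizes) - 1:
            time.sleep(pause_seconds)
    return None
```

The runner parameter exists so the retry logic can be exercised without calling the real CLI; in normal use the default runner invokes transipy through subprocess.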

Contributing

Contributions are what make the open-source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Vũ Quốc Hiển - @hienvq23 - hienvq23@gmail.com

Project Link: https://github.com/NeiH4207/transipy
