Py-googletrans doesn't have a proper built-in translation text length limit #49

Closed
kumaranjeya opened this issue Aug 1, 2019 · 5 comments

Comments

@kumaranjeya

kumaranjeya commented Aug 1, 2019

Input is a subtitles file.

Translating text from "zh-cn" to "en".
Translation: 100% |####################################################################################| Time:  0:00:01
Error: Translation failed.
@kumaranjeya
Author

I already tried other languages such as French, Portuguese, and Spanish, and they all work; only Chinese Simplified gives this error.

@BingLingGroup
Owner

BingLingGroup commented Aug 1, 2019

The "Translation failed" message only appears when this condition is true:

if not translated_text or len(translated_text) != len(text_list):

If you don't mind, could you upload a failed subtitles file for me to test it out? It will be faster for me to figure out what is happening. I guess it's another bug.

@BingLingGroup
Owner

BingLingGroup commented Aug 2, 2019

> Already tried with other languages like French, Portuguese and Spanish which all working except Chinese Simplified giving error.

I tested it. It seems py-googletrans doesn't handle the case where a single translation text is too long. Although my program checks the length, its limit is still too high for text containing full-width characters.

To be specific: from the start, I wanted to make as few translation requests as possible, so I combine multiple lines of subtitle text into a single large text per request. I then needed the text length limit for a single request. According to py-googletrans, the limit is 15k. To be conservative, and based on my understanding of translate.google.com's 5000-character limit, I set the size limit to 4000.
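As an illustration only (this is not the project's actual code), the batching idea above could be sketched roughly as follows. The function name `batch_lines`, the newline separator, and the default `size_limit` are assumptions made for the example.

```python
from typing import Iterable, List


def batch_lines(lines: Iterable[str], size_limit: int = 4000,
                sep: str = "\n") -> List[List[str]]:
    """Group subtitle lines so each joined batch stays within size_limit characters.

    Hypothetical helper for illustration; a single line longer than
    size_limit still ends up alone in its own (oversized) batch.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for line in lines:
        # Joining adds one separator before every line except the first in a batch.
        added = len(line) + (len(sep) if current else 0)
        if current and current_size + added > size_limit:
            # The current batch is full: close it and start a new one.
            batches.append(current)
            current = []
            current_size = 0
            added = len(line)
        current.append(line)
        current_size += added
    if current:
        batches.append(current)
    return batches
```

Each batch would then be joined with the separator and sent as one request, and the translated text split back into lines. If the number of lines that come back differs from the number sent, a length check like the one quoted earlier in this thread would report a failure.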

But somehow that is still too large for text containing full-width characters. Stranger still, lowering it to 2000 for full-width text was not enough either, so I set it to 1000, and it finally works. The program now checks whether a text contains a full-width character; if so, its size counts as four times its length.
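A minimal sketch of that size rule, assuming full-width characters are detected with the standard library's `unicodedata.east_asian_width` (the real detection in the program may differ). The names `has_full_width` and `effective_size` are hypothetical.

```python
import unicodedata


def has_full_width(text: str) -> bool:
    """Return True if any character is East Asian Wide ('W') or Fullwidth ('F')."""
    return any(unicodedata.east_asian_width(ch) in ("W", "F") for ch in text)


def effective_size(text: str, factor: int = 4) -> int:
    """Count a text containing full-width characters as `factor` times its length."""
    # Assumption: the whole text is weighted, not only its full-width characters.
    return len(text) * factor if has_full_width(text) else len(text)
```

Under this reading, weighting such a text at four times its length against a limit of 4000 has the same effect as capping it at 1000 characters.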

This may make translation slower. If you want faster translation, you can manually control the sleep time between two translation requests with the -slp option.
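For readers unfamiliar with the idea, pausing between requests can be sketched like this. It is only an assumption-laden example: `translate_batches` is a hypothetical name, and `translate` stands in for whatever function actually sends one text to the translation service.

```python
import time
from typing import Callable, List, Sequence


def translate_batches(batches: Sequence[str],
                      translate: Callable[[str], str],
                      sleep_seconds: float = 1.0) -> List[str]:
    """Translate each batch in order, sleeping between consecutive requests."""
    results: List[str] = []
    for index, text in enumerate(batches):
        # Sleep before every request except the first one.
        if index > 0 and sleep_seconds > 0:
            time.sleep(sleep_seconds)
        results.append(translate(text))
    return results
```

A smaller sleep value finishes sooner but sends requests closer together, so it is a trade-off to tune by hand.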

@BingLingGroup
Owner

Commit f0b0ec3 should fix this issue. Thanks for your feedback.

@kumaranjeya
Author

kumaranjeya commented Aug 3, 2019 via email

BingLingGroup changed the title from 'Translating text from "zh-cn" to "en" failed error' to 'Py-googletrans doesn't have built-in translation text length limit' on Aug 3, 2019
BingLingGroup changed the title from 'Py-googletrans doesn't have built-in translation text length limit' to 'Py-googletrans doesn't have a proper built-in translation text length limit' on Aug 3, 2019