The docker image on Docker Hub for version 1.2.7 (latest) has broken auto language detect #217
Comments
I can't test this myself, but in any production-like environment I recommend sending your POST requests through a module that handles errors. My guess is that the character is throwing off the language auto-detection. Personally, I applied the following changes to 'language.py' in the LibreTranslate app. The error should be visible in the host console.
I borrowed the solution from someone who had the same issue long ago; sadly, I didn't keep a reference to the post. Nonetheless, the solution still works fine after long periods of testing.
If the above fixes the issue, I'm willing to make a PR (which would obviously credit the original poster of the solution).
Thank you Cristei. The issue, however, affects all text: it does not matter whether the input is just "Ciao" or "Hello", detection still fails. We also see exactly the same issue when making the call with the Python library suggested on the LibreTranslate page. The issue appeared in the latest release. However, the previous version, tagged "main", turns out to show the same behaviour. I am guessing this is caused by some change in the language models that are downloaded, since I believe the "main" Docker image itself (not just "latest") is unchanged, yet its behaviour changed just a week ago. We have worked around the issue by using another language detection engine and specifying the source and target languages explicitly when calling LibreTranslate. We are of course still hoping the actual problem will be fixed at some point, as most users will probably be affected. Best regards,
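For anyone wanting to try the same workaround, here is a minimal sketch of it: detect the language with some other engine, then send an explicit "source" instead of "auto". The helper name `build_translate_payload` and the `detect` callable are hypothetical stand-ins, not part of LibreTranslate; only the "q", "source" and "target" fields come from the request shown in this issue.

```python
from typing import Callable, Dict


def build_translate_payload(text: str, target: str,
                            detect: Callable[[str], str]) -> Dict[str, str]:
    """Build a /translate request body that never relies on source="auto".

    `detect` is an assumption: any external language detector that takes
    text and returns a language code such as "it" or "fr".
    """
    source = detect(text)
    if not source or source == "auto":
        # Refuse to fall back to the server-side auto detection.
        raise ValueError("external detector did not return a language code")
    return {"q": text, "source": source, "target": target}


if __name__ == "__main__":
    # Stand-in detector for illustration only; swap in a real engine.
    def fake_detect(text: str) -> str:
        return "it"

    print(build_translate_payload("Ciao!", "en", fake_detect))
```

The resulting dictionary can be sent as the JSON body of the same /translate request used in the reproduction steps.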
I forgot that I hadn't synced my repository with upstream, and the issue you were facing is very similar to one I've run into myself, hence the response. My apologies.
No worries! I appreciate you looking into this anyhow!
I'm sure this is related to this issue. When running the LibreTranslate docker instance locally, /detect totally fails to detect the language. French text copied from random website:
Strangely, the libretranslate.org instance reports the language correctly as "fr".
This applies to both "latest" and "main" tags.
Yes, Ali's experience is identical to ours.
/Mats
…________________________________
From: Ali Sherief ***@***.***>
Sent: Thursday, March 3, 2022 11:45:03 AM
To: LibreTranslate/LibreTranslate ***@***.***>
Cc: Mats Bjerin ***@***.***>; Author ***@***.***>
Subject: Re: [LibreTranslate/LibreTranslate] The docker image on Docker Hub for version 1.2.7 (latest) has broken auto language detect (Issue #217)
I'm sure this is related to this issue.
When running the LibreTranslate docker instance locally, /detect totally fails to detect the language.
French text copied from random website:
curl -X POST "http://localhost:5000/detect" -H "accept: application/json" -H "Content-Type: application/x-www-form-urlencoded" -d "q=Voil%C3%A0%20pourquoi%2C%20je%20n'ach%C3%A9terais%20pas%20d'Asics%20d'occasion%20o%C3%B9%20m%C3%AAme%20neuve%20sauf%20pour%20ceux%20qui%20veulent%20jeter%20leur%20argent%20par%20la%20fen%C3%AAtre" -i
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Headers: Authorization, Content-Type
Access-Control-Allow-Methods: GET, POST
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Authorization
Access-Control-Max-Age: 1728000
Content-Length: 37
Content-Type: application/json
Date: Thu, 03 Mar 2022 10:29:14 GMT
Server: waitress
[{"confidence":0.0,"language":"en"}]
Strangely, the libretranslate.org instance reports the language correctly as "fr".
This applies to both "latest" and "main" tags.
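A small check along these lines can flag the failure pattern in the response above, where /detect answers 200 OK but reports "en" with a confidence of 0.0. This is only a sketch: the function name `detection_failed` is hypothetical, and treating a confidence of exactly 0.0 as a failed detection is an assumption based on the single response shown in this thread.

```python
import json


def detection_failed(body: str) -> bool:
    """Return True if a /detect response body looks like a failed detection.

    Assumption: a failure shows up as an empty list or as a top result
    with confidence 0.0, as in the response quoted in this thread.
    """
    results = json.loads(body)
    if not results:
        return True
    top = results[0]
    return top.get("confidence", 0.0) == 0.0


if __name__ == "__main__":
    # Response body copied from the curl output above.
    print(detection_failed('[{"confidence":0.0,"language":"en"}]'))
```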
Also, it looks like the (to reproduce - create a virtualenv and install all of the stuff inside requirements.txt - you may have to explicitly install ctranslate2 to resolve dependency issues)
Arabic text from https://polyglot.readthedocs.io/en/latest/Detection.html Even the French text I called LibreTranslate with returns the correct output when calling polyglot's Detector directly:
This indicates that the Detector is being passed additional parameters that are messing with the Detector reliability. This hypothesis is backed by these warnings on the console printed by LibreTranslate daemon when calling
Note: This warning does not always appear during
The failures have occurred because After inserting a pdb breakpoint inside
Now we just have to figure out why the list is empty.
Apparently, argostranslate (this is definitely a bug, someone should edit the label on this issue)
Should be fixed by #219
To replicate
Start with: "docker-compose up -d"
Then run this:
curl -X POST -H "Content-Type: application/json" -d '{"q": "Ciao!", "source": "auto", "target": "en"}' localhost:5000/translate
The language auto detect will fail and no translation is made.
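The same reproduction can be scripted in Python with basic error handling, as suggested in the comments. This is a sketch using only the standard library; the function name `translate`, its default arguments and the shape of the returned error dictionaries are illustrative assumptions, while the URL, headers and JSON fields mirror the curl command above.

```python
import json
import urllib.error
import urllib.request


def translate(text, target="en", source="auto",
              base_url="http://localhost:5000",
              opener=urllib.request.urlopen):
    """POST to /translate and return the decoded JSON reply.

    Network and HTTP errors are returned as a dict with an "error" key
    instead of being raised. `opener` is injectable to allow testing
    without a running server.
    """
    body = json.dumps({"q": text, "source": source, "target": target})
    request = urllib.request.Request(
        base_url + "/translate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with opener(request, timeout=10) as response:
            return json.loads(response.read().decode("utf-8"))
    except urllib.error.HTTPError as err:
        # HTTPError subclasses URLError, so it has to be caught first.
        detail = err.read().decode("utf-8", "replace")
        return {"error": "HTTP %d" % err.code, "detail": detail}
    except urllib.error.URLError as err:
        return {"error": "connection failed", "detail": str(err.reason)}
```

With the container from the steps above running, calling `translate("Ciao!")` sends the same request as the curl command, and any error reported by the server is returned rather than lost.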