Translation from English to Catalan hallucinates often #312

Closed
jordimas opened this issue Dec 19, 2022 · 2 comments

Comments

jordimas commented Dec 19, 2022

To reproduce, starting from a clean installation:

  • pip install argostranslate
  • argos-translate --from en --to ca < text.txt
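To narrow down which paragraph triggers the extra text, the same command can be run on one paragraph at a time. A minimal sketch, assuming `argos-translate` is on the PATH and reads its input from stdin as in the command above; `split_paragraphs`, `run_cli`, and `translate_by_paragraph` are hypothetical helper names, not part of argostranslate:

```python
import re
import subprocess
from typing import Callable, List, Tuple


def split_paragraphs(text: str) -> List[str]:
    """Split text on blank lines and drop empty chunks."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def run_cli(paragraph: str) -> str:
    """Pipe one paragraph through the CLI command used in the steps above."""
    result = subprocess.run(
        ["argos-translate", "--from", "en", "--to", "ca"],
        input=paragraph,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def translate_by_paragraph(
    text: str, translate: Callable[[str], str] = run_cli
) -> List[Tuple[str, str]]:
    """Return (source, translation) pairs, one per paragraph."""
    return [(p, translate(p)) for p in split_paragraphs(text)]
```

Calling `translate_by_paragraph` on the contents of `text.txt` and printing each pair side by side should show whether the unrelated text appears for one specific paragraph or only when the whole file is translated at once.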

See the sample below. The text "© 2019 Màrqueting digital de BookingSuite" is not present in the English sample.

The models trained at Softcatalà with the same corpus do not show this problem.

English text

"A new policy now prohibits linking to other social media in any way, including (but not limited to!) fediverse / #Mastodon URLS. I will not participate in an explicitly walled garden.

help.twitter.com/en/rules-and-...

Those who stay are putting themselves at risk, and I tried to sound the warning for weeks. I'm done. It's nicer here"

Catalan text

"Una nova política prohibeix ara la vinculació amb altres xarxes socials de qualsevol manera, incloent (però no limitat a!) fediverse / #Mastodon URLS. No participaré en un jardí de paret explícita.

© 2019 Màrqueting digital de BookingSuite. BookingSuite és una marca de Booking.com. Estaràs en bones mans: pots confiar en el gran servei d'Atenció al client que proporciona Booking.com

Els que s'allotgen es posen en risc, i he intentat sonar l'avís durant setmanes. He acabat. És més bonic

splaGit commented Dec 19, 2022

Another sample:

English text

"It makes me incredibly sad to see how much R has had to take on because I’m so ill, and how slow support has been to get to us because of the neglected nature of my illness by the health and medical establishment. And of course, there’s the guilt. So much guilt.

Christmas in our house is usually a low key affair, just the two of us enjoying gifts under the tree, a meal, and lots of TV and games. But I just don’t want to make any more work for R, and making Christmas happen is work. 3/"

Catalan text

"Em fa molt trist veure quant R ha hagut de prendre perquè estic tan malalt, i com el suport lent ha estat arribar a nosaltres per la naturalesa descuidada de la meva malaltia per la salut i l'establiment mèdic. I per descomptat, hi ha la culpa. Molta culpa.

Aquest lloc web utilitza cookies per millorar la vostra experiència. Podeu desactivar-lo si ho voleu. Accepto Més informació Però no vull fer més feina per R, i fer que el Nadal passi * és feina. 3/"

The last paragraph contains text that does not appear in the original English text:

"This website uses cookies to improve your experience. You can disable it if you want it. I accept More information But I do not want to do more work for R, and make Christmas happen * is work."

splaGit commented Dec 19, 2022

And another one:

English text

"Does it pay to pay? A comparison of the benefits of open-access publishing across various sub-fields in Biology | bioRxiv"

Catalan text

"Paga? BioRxiv"
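Cases like this one, where most of the source is dropped, can be flagged automatically with a rough length-ratio check. This is only a sketch: `length_ratio` and `flag_suspicious` are hypothetical helpers, and the `low`/`high` thresholds are arbitrary assumptions rather than tuned values. A check like this may well miss a translation where inserted text roughly offsets dropped text.

```python
from typing import List, Tuple


def length_ratio(source: str, translation: str) -> float:
    """Character length of the translation relative to the source."""
    return len(translation) / max(len(source), 1)


def flag_suspicious(
    pairs: List[Tuple[str, str]], low: float = 0.5, high: float = 2.0
) -> List[int]:
    """Return indices of pairs whose length ratio falls outside [low, high].

    The thresholds are illustrative assumptions, not tuned values.
    """
    return [
        i
        for i, (source, translation) in enumerate(pairs)
        if not low <= length_ratio(source, translation) <= high
    ]
```

A ratio far below 1 suggests dropped content, as in the bioRxiv title above; a ratio far above 1 suggests inserted content, as with a short URL that comes back as a long unrelated paragraph.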
