Translation for HTML files can't be machine translated with "nottranslated" filter + "fuzzy" translations don't appear in downloaded translation if they still have "Needs editing" flag #9223

Closed
2 tasks done
jerzy-dudzic opened this issue May 12, 2023 · 2 comments

Comments

@jerzy-dudzic

Describe the issue

Hello,
Some background on what I'm trying to achieve:
I upload HTML files (in Polish) through the API, then create multiple translations for them (English, Czech, Slovak, etc.).
Then:

  • for English - I want to run machine translation in "fuzzy" mode so that all translations are available as soon as possible, but I also want them to be reviewed manually by someone who knows English, hence the "fuzzy" mode
  • for other languages - I want to run machine translation in "translate" mode. We accept those as final translations and they won't be reviewed. This works fine.
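
For readers who want to reproduce the setup, the per-language choice of mode could be sketched roughly as follows. This is only an illustration, assuming the REST endpoint `/api/translations/<project>/<component>/<language>/autotranslate/` is used. The base URL, project and component slugs, engine name, and threshold are hypothetical placeholders, not values from this report. The parameter names should be checked against the API documentation of the deployed Weblate version.

```python
from urllib.parse import quote

# Hypothetical values for illustration; none of these come from the report.
BASE_URL = "https://weblate.example.com"
PROJECT = "docs"
COMPONENT = "landing-page"
ENGINE = "deepl"

# English is reviewed manually, so its machine translations are stored as
# "Needs editing" (mode "fuzzy"); other languages are accepted as final.
REVIEWED_LANGUAGES = {"en"}


def build_autotranslate_request(language, filter_type="todo"):
    """Return (url, payload) for one automatic-translation call."""
    mode = "fuzzy" if language in REVIEWED_LANGUAGES else "translate"
    url = "{}/api/translations/{}/{}/{}/autotranslate/".format(
        BASE_URL, quote(PROJECT), quote(COMPONENT), quote(language)
    )
    payload = {
        "mode": mode,
        "filter_type": filter_type,
        "auto_source": "mt",  # use machine translation rather than other components
        "engines": [ENGINE],
        "threshold": 80,      # assumed matching-score threshold
    }
    return url, payload


if __name__ == "__main__":
    for lang in ("en", "cs", "sk"):
        print(*build_autotranslate_request(lang))
```

The sketch only builds the request. Sending it would additionally need an authenticated POST with an API token.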

There are two problems when translating HTML files in "fuzzy" mode:

Problem 1

Strings from HTML files automatically inherit the source-language content, so the only way to machine translate them is to use the todo filter. The nottranslated filter doesn't trigger any translation because the strings appear as translated from the very beginning. They are "unfinished", so todo works, but they are not "untranslated". Using the todo filter in combination with fuzzy mode effectively re-runs machine translation on every run (we trigger this by cron every night) until someone removes the "Needs editing" flag from a given translation, which results in extra costs.
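
The cost effect can be illustrated with a small toy model. This is a simplified sketch, not Weblate's actual data model or filter implementation. The state names, the `Unit` class, and the filter semantics are assumptions based on the behavior described above.

```python
from dataclasses import dataclass

# Simplified string states; a toy model, not Weblate's data model.
EMPTY, NEEDS_EDITING, TRANSLATED = "empty", "needs-editing", "translated"


@dataclass
class Unit:
    source: str
    target: str
    state: str


def selected(unit, filter_type):
    """Decide whether a filter picks up a unit (assumed semantics)."""
    if filter_type == "nottranslated":
        return unit.state == EMPTY
    if filter_type == "todo":
        return unit.state in (EMPTY, NEEDS_EDITING)
    raise ValueError(filter_type)


def nightly_run(units, filter_type):
    """Machine-translate selected units in "fuzzy" mode; return the MT call count."""
    calls = 0
    for unit in units:
        if selected(unit, filter_type):
            unit.target = "MT(" + unit.source + ")"  # stand-in for a paid MT call
            unit.state = NEEDS_EDITING               # "fuzzy" mode keeps the flag
            calls += 1
    return calls


def new_html_units():
    # HTML strings start with the source text copied into the target and
    # are flagged as unfinished, as described in the report.
    return [Unit(s, s, NEEDS_EDITING) for s in ("Witaj", "Cena", "Kontakt")]


def calls_over_nights(filter_type, nights=3):
    """Count MT calls per night for a fresh HTML translation."""
    units = new_html_units()
    return [nightly_run(units, filter_type) for _ in range(nights)]
```

Under these assumptions, `calls_over_nights("nottranslated")` never translates anything, while `calls_over_nights("todo")` pays for every string again each night until a reviewer clears the flag.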

Problem 2

When HTML files are machine translated in "fuzzy" mode and the strings still have the "Needs editing" flag, the downloaded translation shows the content in the original (Polish) language, whereas it should show the translated (English) content.

Neither problem occurs with YAML translations.

Thank you very much for looking into this!
Jerzy

I already tried

  • I've read and searched the documentation.
  • I've searched for similar issues in this repository.

Steps to reproduce the behavior

Problem 1

  1. Create a project -> "Add new translation component" -> select "Translate document", upload an HTML file, and select "Polish" as the source language.
    (screenshot)
  2. After creating the English translation:
    (screenshot)
  3. After opening the translation page for any English string:
    (screenshot)

Problem 2

  1. Execute the steps above, then run machine translation for English with Add as "Needs editing" and the "All strings" filter:
    (screenshot)
  2. Download the translation. It should show the content in English but shows it in Polish instead:
    (screenshot)

Expected behavior

Problem 1

When adding a new translation for an HTML component:

  • it should have empty strings for the target language
  • it should show as "Untranslated"

Problem 2

When downloading a translation of an HTML component:

  • if translated strings are marked as "Needs editing", the downloaded translation file should still include those translations

The above expectations are based on how this already works for YAML translations.
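
The expected behavior for Problem 2 can be shown with a toy selection function. This is only an illustration of the observed and expected outcomes, not the way Weblate or translate-toolkit writes HTML files. The function name, the state strings, and the `include_needs_editing` switch are hypothetical.

```python
def export_text(source, target, state, include_needs_editing):
    """Pick the text written to the downloaded file for one string.

    Toy model: `state` is either "translated" or "needs-editing".
    """
    if state == "translated":
        return target
    if state == "needs-editing" and include_needs_editing:
        return target
    return source  # fall back to the source-language text


# Observed in the report: the Polish source is written.
observed = export_text("Witaj", "Hello", "needs-editing", include_needs_editing=False)
# Expected by the reporter: the English machine translation is written.
expected = export_text("Witaj", "Hello", "needs-editing", include_needs_editing=True)
```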

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

Docker container

Weblate versions

  • Weblate: 4.17
  • Django: 4.2
  • siphashc: 2.1
  • translate-toolkit: 3.8.6
  • lxml: 4.9.2
  • Pillow: 9.5.0
  • nh3: 0.2.11
  • python-dateutil: 2.8.2
  • social-auth-core: 4.4.2
  • social-auth-app-django: 5.2.0
  • django-crispy-forms: 2.0
  • oauthlib: 3.2.2
  • django-compressor: 4.3.1
  • djangorestframework: 3.14.0
  • django-filter: 23.1
  • django-appconf: 1.0.5
  • user-agents: 2.2.0
  • filelock: 3.11.0
  • rapidfuzz: 3.0.0
  • openpyxl: 3.1.2
  • celery: 5.2.7
  • django-celery-beat: 2.5.0
  • kombu: 5.2.4
  • translation-finder: 2.15
  • weblate-language-data: 2023.4
  • html2text: 2020.1.16
  • pycairo: 1.23.0
  • pygobject: 3.44.1
  • diff-match-patch: 20200713
  • requests: 2.28.2
  • django-redis: 5.2.0
  • hiredis: 2.2.2
  • sentry_sdk: 1.21.1
  • Cython: 0.29.34
  • misaka: 2.1.1
  • GitPython: 3.1.31
  • borgbackup: 1.2.4
  • pyparsing: 3.0.9
  • pyahocorasick: 2.0.0
  • python-redis-lock: 4.0.0
  • charset-normalizer: 3.1.0
  • Python: 3.11.3
  • Git: 2.30.2
  • psycopg2: 2.9.6
  • psycopg2-binary: 2.9.6
  • phply: 1.2.6
  • ruamel.yaml: 0.17.21
  • tesserocr: 2.6.0
  • boto3: 1.26.124
  • zeep: 4.2.1
  • aeidon: 1.12
  • iniparse: 0.5
  • mysqlclient: 2.1.1
  • Mercurial: 6.4.2
  • git-svn: 2.30.2
  • git-review: 2.3.1
  • Redis server: 6.2.12
  • PostgreSQL server: 15.3
  • Database backends: django.db.backends.postgresql
  • Cache backends: default:RedisCache, avatar:FileBasedCache
  • Email setup: django.core.mail.backends.smtp.EmailBackend: smtp.eu.mailgun.org
  • OS encoding: filesystem=utf-8, default=utf-8
  • Celery: redis://cache:6379/1, redis://cache:6379/1, regular
  • Platform: Linux 5.19.0-1024-aws (x86_64)

Weblate deploy checks

System check identified some issues:

INFOS:
?: (weblate.I021) Error collection is not set up, it is highly recommended for production use
	HINT: https://docs.weblate.org/en/weblate-4.17/admin/install.html#collecting-errors
?: (weblate.I028) Backups are not configured, it is highly recommended for production use
	HINT: https://docs.weblate.org/en/weblate-4.17/admin/backup.html

System check identified 2 issues (1 silenced).

Additional context

I previously reported a somewhat similar problem (with SRT files), so I'm adding a link to it in case it helps debug the issue: #8215

@nijel nijel added the bug Something is broken. label May 23, 2023
@nijel nijel self-assigned this May 23, 2023
@nijel nijel added this to the 4.18 milestone May 23, 2023
@nijel
Member

nijel commented May 23, 2023

Problem 1 – this is intentional to have a working HTML document even if the strings are not translated.

Problem 2 – the fuzzy strings should be written. I will fix that.

@nijel nijel closed this as completed in c24c8e0 May 23, 2023
@github-actions

Thank you for your report; the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.
