Automatic translation using Weblate Translation Memory is very slow #4992

Closed
tomkolp opened this issue Dec 7, 2020 · 10 comments
Labels
question: This is more a question for the support than an issue.

Comments

@tomkolp
Contributor

tomkolp commented Dec 7, 2020

Describe the bug

Automatic translation using Weblate Translation Memory is very slow. Component has 107,374 strings and 379,094 words. The imported memory has 120,000 strings. Translation using Weblate Translation Memory reached less than 30% after 48 hours.
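
For scale, the figures above can be turned into a rough throughput estimate. This is only a back-of-the-envelope sketch: it treats "less than 30%" as exactly 30%, so the results are bounds rather than measurements, and the function name is made up for illustration. Under that assumption the run processes at most about 671 strings per hour (roughly 5.4 seconds per string) and would need at least 160 hours to finish.

```python
# Figures quoted in the report above.
TOTAL_STRINGS = 107_374
ELAPSED_HOURS = 48
FRACTION_DONE = 0.30  # "less than 30%", so this is an upper bound


def throughput_upper_bound(total, fraction_done, hours):
    """Return (strings per hour, seconds per string) implied by the progress."""
    done = total * fraction_done
    per_hour = done / hours
    seconds_per_string = 3600 / per_hour
    return per_hour, seconds_per_string


per_hour, seconds_per_string = throughput_upper_bound(
    TOTAL_STRINGS, FRACTION_DONE, ELAPSED_HOURS
)
print(f"at most {per_hour:.0f} strings/hour")
print(f"at least {seconds_per_string:.1f} s per string")
print(f"at least {ELAPSED_HOURS / FRACTION_DONE:.0f} hours for the whole component")
```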

To Reproduce the bug

Steps to reproduce the behavior:

  1. Go to 'tools->Automatic translation'
  2. Select: Add as translations
  3. Select: Not translated strings
  4. Select: Machine translation -> Weblate Translation Memory
  5. Set: Score threshold: 100
  6. Click on: Apply

Expected behavior

I expected faster speed.

Server configuration and status

Weblate installation: Docker

  • Weblate: 4.3.2
  • Django: 3.1.3
  • siphashc: 2.1
  • Whoosh: 2.7.4
  • translate-toolkit: 3.2.0
  • lxml: 4.6.1
  • Pillow: 8.0.1
  • bleach: 3.2.1
  • python-dateutil: 2.8.1
  • social-auth-core: 3.3.3
  • social-auth-app-django: 4.0.0
  • django-crispy-forms: 1.9.2
  • oauthlib: 3.1.0
  • django-compressor: 2.4
  • djangorestframework: 3.12.1
  • django-filter: 2.4.0
  • django-appconf: 1.0.4
  • user-agents: 2.2.0
  • filelock: 3.0.12
  • setuptools: 40.8.0
  • jellyfish: 0.8.2
  • openpyxl: 3.0.5
  • celery: 4.4.7
  • kombu: 4.6.11
  • translation-finder: 2.5
  • weblate-language-data: 2020.11
  • html2text: 2020.1.16
  • pycairo: 1.16.2
  • pygobject: 3.30.4
  • diff-match-patch: 20200713
  • requests: 2.24.0
  • django-redis: 4.12.1
  • hiredis: 1.1.0
  • sentry_sdk: 0.19.2
  • Cython: 0.29.21
  • misaka: 2.1.1
  • GitPython: 3.1.11
  • borgbackup: 1.1.14
  • pyparsing: 2.4.7
  • Python: 3.7.3
  • Git: 2.20.1
  • psycopg2: 2.8.6
  • psycopg2-binary: 2.8.6
  • phply: 1.2.5
  • chardet: 3.0.4
  • ruamel.yaml: 0.16.12
  • tesserocr: 2.5.1
  • akismet: 1.1
  • boto3: 1.16.12
  • zeep: 4.0.0
  • aeidon: 1.7.0
  • iniparse: 0.5
  • mysqlclient: 2.0.1
  • Mercurial: 5.6
  • git-svn: 2.20.1
  • git-review: 1.28.0
  • Redis server: 6.0.9
  • PostgreSQL server: 13.1
  • Database backends: django.db.backends.postgresql
  • Cache backends: default:RedisCache, avatar:FileBasedCache
  • Email setup: django.core.mail.backends.smtp.EmailBackend: xxxxxxxxxxxx.eu
  • OS encoding: filesystem=utf-8, default=utf-8
  • Celery: redis://cache:6379/1, redis://cache:6379/1, regular
  • Platform: Linux 5.4.0-54-generic (x86_64)

Weblate deploy checks

System check identified some issues:

INFOS:
?: (weblate.I021) Error collection is not set up, it is highly recommended for production use
HINT: https://docs.weblate.org/en/weblate-4.3.2/admin/install.html#collecting-errors
?: (weblate.I028) Backups are not configured, it is highly recommended for production use
HINT: https://docs.weblate.org/en/weblate-4.3.2/admin/backup.html

System check identified 2 issues (1 silenced).

Exception traceback

weblate_1 | uwsgi stderr | [pid: 688|app: 0|req: 315/4214] 172.21.0.2 () {58 vars in 1060 bytes} [Mon Dec 7 18:46:13 2020] GET /js/task/fe15c95d-5103-45fc-87e0-fa2f90f4142d/ => generated 66 bytes in 89 msecs (HTTP/1.0 200) 9 headers in 622 bytes (2 switches on core 0)
weblate_1 | nginx stdout | 172.21.0.2 - - [07/Dec/2020:18:46:13 +0100] "GET /js/task/fe15c95d-5103-45fc-87e0-fa2f90f4142d/ HTTP/1.0" 200 66 "https://weblate.xxxxxxxx.eu/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"
weblate_1 | uwsgi stderr | [pid: 688|app: 0|req: 316/4215] 172.21.0.2 () {58 vars in 1060 bytes} [Mon Dec 7 18:46:14 2020] GET /js/task/fe15c95d-5103-45fc-87e0-fa2f90f4142d/ => generated 66 bytes in 62 msecs (HTTP/1.0 200) 9 headers in 622 bytes (2 switches on core 0)
weblate_1 | nginx stdout | 172.21.0.2 - - [07/Dec/2020:18:46:14 +0100] "GET /js/task/fe15c95d-5103-45fc-87e0-fa2f90f4142d/ HTTP/1.0" 200 66 "https://weblate.xxxxxxxx.eu/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.88 Safari/537.36"

Additional context

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 46 bits physical, 48 bits virtual
CPU(s): 24
On-line CPU(s) list: 0-23
Thread(s) per core: 1
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
Stepping: 4
CPU MHz: 2593.774
BogoMIPS: 5187.54
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 768 KiB
L1i cache: 768 KiB
L2 cache: 6 MiB
L3 cache: 480 MiB
NUMA node0 CPU(s): 0-23


@nijel
Member

nijel commented Dec 8, 2020

Is search on translation memory (while translating) slow for you as well? It can be caused by sub-optimal PostgreSQL configuration. You might want to start with https://pgtune.leopard.in.ua/ to get a reasonable configuration for your environment.

@tomkolp
Contributor Author

tomkolp commented Dec 8, 2020

I'm just getting started with Docker.
I edited the file /var/lib/docker/volumes/weblate-docker_postgres-data/_data/postgresql.conf, but I don't see any improvement.

@nijel
Member

nijel commented Dec 8, 2020

The configuration applies after restart, but I assume you did that.

Another thing which might help is trying to vacuum the memory_memory table: VACUUM FULL ANALYZE memory_memory;
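
A minimal sketch of running that statement from Python, for setups where opening a psql shell inside the container is inconvenient. It assumes a psycopg2-style DB-API connection; the helper name and the connection parameters in the usage note are hypothetical and not taken from this thread. Two things worth knowing: VACUUM cannot run inside a transaction block, so autocommit has to be enabled first, and VACUUM FULL rewrites the table under an exclusive lock, so the translation memory is unavailable while it runs.

```python
def vacuum_full_analyze(conn, table="memory_memory"):
    """Run VACUUM FULL ANALYZE on one table.

    `conn` is assumed to be a psycopg2 connection (or any object with the
    same `autocommit` attribute and `cursor()` context manager).
    """
    # Table names cannot be sent as query parameters, so only accept
    # plain identifiers before building the statement.
    if not table.isidentifier():
        raise ValueError(f"unexpected table name: {table!r}")

    # VACUUM refuses to run inside a transaction block; psycopg2 opens one
    # implicitly unless autocommit is turned on.
    conn.autocommit = True
    with conn.cursor() as cursor:
        cursor.execute(f"VACUUM FULL ANALYZE {table};")


# Hypothetical usage; host, database name and user depend on the deployment:
#
#   import psycopg2
#   conn = psycopg2.connect(host="database", dbname="weblate", user="weblate")
#   vacuum_full_analyze(conn)
#   conn.close()
```

Validating the table name instead of interpolating arbitrary input keeps the helper safe to reuse for other tables.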

@tomkolp
Contributor Author

tomkolp commented Dec 8, 2020

A few hours have passed and it is much faster. The translation should now be completed in approximately 12 hours. Does Weblate have built-in automatic periodic vacuuming of the database?
Thanks for the tips :)

@nijel
Member

nijel commented Dec 8, 2020

You might want to tweak the autovacuum configuration inside PostgreSQL. In most cases the default configuration works well.
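
As a rough illustration of what the defaults mean: PostgreSQL autovacuum vacuums a table once its dead rows exceed autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * row count (defaults 50 and 0.2), and re-analyzes it once changed rows exceed the analogous analyze settings (defaults 50 and 0.1). The sketch below applies those formulas to a table of 120,000 rows. That row count is an assumption borrowed from the size of the imported memory mentioned earlier, and the function name is made up; the real memory_memory table may be larger.

```python
def autovacuum_trigger(row_count, threshold=50, scale_factor=0.2):
    """Number of dead (or changed) rows after which autovacuum acts.

    Defaults match PostgreSQL's autovacuum_vacuum_threshold and
    autovacuum_vacuum_scale_factor.
    """
    return threshold + scale_factor * row_count


ROWS = 120_000  # assumed table size, see the note above

vacuum_at = autovacuum_trigger(ROWS)                      # vacuum defaults
analyze_at = autovacuum_trigger(ROWS, scale_factor=0.1)   # analyze defaults

print(f"vacuum after about {vacuum_at:.0f} dead rows")
print(f"analyze after about {analyze_at:.0f} changed rows")
```

With these assumptions the table is vacuumed only after roughly 24,000 dead rows have accumulated, which is why a large bulk import can leave it bloated or with stale statistics for a while. Lowering the scale factor for a single table is one possible tweak.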

@nijel
Member

nijel commented Dec 8, 2020

@ilocit Maybe you could try the suggestion in #4992 (comment) as well; your case had problems similar to those described here.

@ilocit
Contributor

ilocit commented Dec 8, 2020

Thanks for the tip, @nijel. I will try this too.

@nijel added the "question" label Dec 13, 2020
@github-actions

This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger. In case your question is already answered, making a donation is the right way to say thank you!

@tomkolp
Contributor Author

tomkolp commented Dec 21, 2020

After applying all the advice and updating to Weblate 4.4, the speed is a bit better.

@tomkolp closed this as completed Dec 21, 2020
@github-actions

The issue you have reported is resolved now. If you don’t feel it’s right, please follow its labels to get a clue and take further steps.

  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, don’t hesitate to support Weblate by making a donation.
