Weblate scanning has no signs of progress #7250

Closed
2 tasks done
pontaoski opened this issue Feb 14, 2022 · 6 comments

Labels
wontfix Nobody will work on this.

Comments

@pontaoski

Describe the issue

When importing large repositories into Weblate, the scanning phase takes a very long time, with no indication of what it is doing or how long it will take.

I already tried

  • I've read and searched the documentation.
  • I've searched for similar issues in this repository.

Steps to reproduce the behavior

  1. Go to create component
  2. Import from VCS
  3. Input the URL of a large repository (the one I'm importing is about 150k files)
  4. Press continue
  5. It hangs; no signs of life to the user

Expected behavior

  1. Go to create component
  2. Import from VCS
  3. Import URL of a large repository
  4. Press continue
  5. It tells me what it's doing and how long it thinks it will take

Screenshots

No response

Exception traceback

No response

How do you run Weblate?

PyPI module

Weblate versions

  • Weblate: 4.10.1
  • Django: 4.0.2
  • siphashc: 2.1
  • translate-toolkit: 3.5.3
  • lxml: 4.7.0
  • Pillow: 8.4.0
  • bleach: 4.1.0
  • python-dateutil: 2.8.2
  • social-auth-core: 4.1.0
  • social-auth-app-django: 5.0.0
  • django-crispy-forms: 1.13.0
  • oauthlib: 3.2.0
  • django-compressor: 3.1
  • djangorestframework: 3.13.1
  • django-filter: 21.1
  • django-appconf: 1.0.5
  • user-agents: 2.2.0
  • filelock: 3.4.2
  • setuptools: 59.6.0
  • jellyfish: 0.8.9
  • openpyxl: 3.0.9
  • celery: 5.2.3
  • kombu: 5.2.3
  • translation-finder: 2.11
  • weblate-language-data: 2022.1
  • html2text: 2020.1.16
  • pycairo: 1.20.1
  • pygobject: 3.42.0
  • diff-match-patch: 20200713
  • requests: 2.26.0
  • django-redis: 5.2.0
  • hiredis: 2.0.0
  • sentry_sdk: 1.5.4
  • Cython: 0.29.27
  • misaka: 2.1.1
  • GitPython: 3.1.26
  • borgbackup: 1.1.17
  • pyparsing: 3.0.7
  • pyahocorasick: 1.4.2
  • python-redis-lock: 3.7.0
  • Python: 3.10.2
  • Git: 2.35.1
  • psycopg2-binary: 2.9.3
  • phply: 1.2.5
  • chardet: 4.0.0
  • ruamel.yaml: 0.17.20
  • tesserocr: 2.5.2
  • boto3: 1.20.49
  • zeep: 4.1.0
  • aeidon: 1.10.1
  • iniparse: 0.5
  • Mercurial: 6.0.2
  • git-svn: 2.35.1
  • git-review: 2.2.0
  • Redis server: 6.2.6
  • PostgreSQL server: 14.1
  • Database backends: django.db.backends.postgresql
  • Cache backends: default:RedisCache, avatar:FileBasedCache
  • Email setup: django.core.mail.backends.smtp.EmailBackend: smtp.sendgrid.net
  • OS encoding: filesystem=utf-8, default=utf-8
  • Celery: redis://localhost:6379, redis://localhost:6379, regular
  • Platform: Linux 5.17.0-0.rc0.20220112gitdaadb3bd0e8d.63.fc36.x86_64 (x86_64)

Weblate deploy checks

No response

Additional context

No response

@tomkolp
Contributor

tomkolp commented Feb 14, 2022

I always use the console and docker logs during import. There is information about the progress of file processing. Unfortunately, I do not always have access to this console remotely.

@nijel
Member

nijel commented Feb 14, 2022

Component creation has a log that is visible in the application. The repository scanning merely consists of git clone...

@nijel
Member

nijel commented Feb 14, 2022

Related to #7251

@nijel
Member

nijel commented Feb 14, 2022

To figure out which operation is actually the expensive one, you can try the steps without Weblate:

  1. Get the default branch (unless you specify it): git ls-remote --symref repo:url HEAD
  2. Clone the repository: git clone --depth 1 --branch repo:branch repo:url repo:destination
  3. Find translation files using translation-finder: translation-finder repo:destination

But with ~150k files, my guess is also that translation-finder is the bottleneck here, and #7251 could address this.
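
To see which of the three steps dominates, each one can be timed separately. The following is a minimal sketch, not part of Weblate: the helper names `timed` and `build_steps` are made up for illustration, and the script assumes `git` and the `translation-finder` command are on the PATH. The URL, branch and destination are passed as arguments, in place of the repo:url, repo:branch and repo:destination placeholders above.

```python
import subprocess
import sys
import time


def timed(label, cmd):
    """Run cmd, print how long it took, and return the elapsed seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True)
    elapsed = time.monotonic() - start
    print(f"{label}: {elapsed:.1f}s")
    return elapsed


def build_steps(url, branch, destination):
    """Return (label, command) pairs mirroring the three manual steps."""
    return [
        ("ls-remote", ["git", "ls-remote", "--symref", url, "HEAD"]),
        (
            "clone",
            ["git", "clone", "--depth", "1", "--branch", branch, url, destination],
        ),
        ("translation-finder", ["translation-finder", destination]),
    ]


# Hypothetical usage: python time_import.py <url> <branch> <destination>
if __name__ == "__main__" and len(sys.argv) == 4:
    for label, cmd in build_steps(*sys.argv[1:4]):
        timed(label, cmd)
```

If the last step takes most of the time, that would support the guess that translation-finder, rather than the clone, is the bottleneck.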

@nijel
Member

nijel commented Feb 14, 2022

I've looked at translation-finder and there is a lot of room to improve performance there. WeblateOrg/translation-finder@510ef7a should remove ~300k syscalls in your case.
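
The commit itself is not reproduced in this thread, so the following is only a loose illustration of one general way to cut per-file work when scanning many paths: testing each name once against a single combined pattern instead of once per pattern. It is a sketch under stated assumptions and not translation-finder's actual code; the pattern list and both function names are hypothetical.

```python
import fnmatch
import re

# Hypothetical patterns for illustration; not translation-finder's real list.
PATTERNS = ["*.po", "*.pot", "*.xliff", "strings*.xml"]


def match_each(name, patterns=PATTERNS):
    """Baseline: one fnmatch call per pattern for every name."""
    return any(fnmatch.fnmatchcase(name, pattern) for pattern in patterns)


# Translate every glob to a regex once and join them into one alternation,
# so each name is tested with a single regex match.
COMBINED = re.compile(
    "|".join(f"(?:{fnmatch.translate(pattern)})" for pattern in PATTERNS)
)


def match_once(name):
    """Single match per name against the precompiled combined pattern."""
    return COMBINED.match(name) is not None
```

Both functions return the same answer for a given name; the second compiles the patterns once up front, which matters when the same check runs over roughly 150k file names.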

nijel added a commit to WeblateOrg/translation-finder that referenced this issue Feb 14, 2022
This performs way better due to single matching per string.

See WeblateOrg/weblate#7250

@github-actions

This issue has been automatically marked as stale because there wasn’t any recent activity.

It will be closed soon if no further action occurs.

Thank you for your contributions!

@github-actions github-actions bot added the wontfix Nobody will work on this. label Feb 25, 2022
@github-actions github-actions bot closed this as completed Mar 1, 2022