perform_push task locks repository for a long time #9128
Comments
If you want more fine-grained insight into performance, I suggest using Sentry – it collects timing of most of the operations (command executions, SQL queries, cache accesses). Looking at our Sentry performance data, it turns out that alert handling might be problematic here with many linked components. Is that what you are using? I've created #9139 to cover this. |
This heavily reduces number of update queries. See WeblateOrg#9128
Are you using the GitHub pull requests integration (or a similar integration for another code hosting service)? In that case, lowering VCS_API_DELAY might improve the situation. |
Thanks for your answers. I am not using linked components and the VCS integration is with a local MS DevOps installation. I have enabled Sentry integration. However, I did not set Does the Sentry integration also log the performance of git invocations? I am curious to learn those statistics. |
It traces the performance of all operations, see https://docs.sentry.io/product/performance/ for their docs. |
This heavily reduces number of update queries. See #9128
Maybe tweaking PostgreSQL configuration will help? See https://docs.weblate.org/en/latest/admin/install/docker.html#configuring-postgresql-server |
I'll try to find out whether something is locking the table. When I run the same kind of UPDATE query while the lock is held, it also takes more than a minute to execute. Outside those times, it takes just a few ms. I use a separate PostgreSQL server, but it is also in a Docker container. |
I used some queries from https://wiki.postgresql.org/wiki/Lock_Monitoring to find the source of the locks. These are the processes at the moment when the timeouts occur: The query from the link above gives 7526 as the blocked pid and 7524 as the blocking pid. I suspect that this is a classic lock-ordering problem: there are two kinds of locks, the Weblate lock in Redis and the table lock in PostgreSQL. The Celery task holds the Redis lock and needs the PG lock, while another process has the PG lock and wants the Redis lock. It seems that that other process might be another I hope you can find a solution. |
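The suspected lock inversion can be illustrated with a minimal sketch. This is not Weblate code: both locks are plain in-process `threading.Lock` objects standing in for the Redis lock and the PostgreSQL table lock, and the names `celery_task` and `other_process` are only labels for the two sides described above. Each side takes its first lock, then tries the other one with a timeout.

```python
import threading


def demo(timeout=0.2):
    """Two workers take the same two locks in opposite order."""
    # Hypothetical stand-ins: the real locks live in Redis and PostgreSQL.
    redis_lock = threading.Lock()
    pg_lock = threading.Lock()
    both_hold_first = threading.Barrier(2)
    both_tried_second = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until each side holds its first lock.
            both_hold_first.wait()
            ok = second.acquire(timeout=timeout)
            got_second[name] = ok
            # Keep holding the first lock until both attempts have finished.
            both_tried_second.wait()
            if ok:
                second.release()

    threads = [
        threading.Thread(target=worker, args=("celery_task", redis_lock, pg_lock)),
        threading.Thread(target=worker, args=("other_process", pg_lock, redis_lock)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return got_second


if __name__ == "__main__":
    print(demo())
```

Neither worker obtains its second lock before the timeout expires, which mirrors requests timing out while each side waits for the other. A common remedy for this pattern in general is to acquire the locks in the same order everywhere; whether that matches the actual fix is not stated in this thread.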
Thanks for the detailed analysis, I think I now see where the issue is. |
Thank you for your report; the issue you have reported has just been fixed.
Thank you! |
Describe the issue
I update the units in my Docker-hosted Weblate installation via the API by uploading partial gettext files. Sometimes, these requests fail due to a timeout.
I've extracted a part of the logs from around the time this happens:
The perform_push task is created at 16:46:27,952.
The do_update part is executed and acquires the lock at 16:46:27,981.
The repository is up to date at 16:46:28,342.
However, only at 16:48:28,103 the lock is released, 2 minutes later.
It is immediately locked again, I assume for the push_repo part of the task.
It is hard to debug this problem because the individual steps in component.do_update are not logged. Perhaps debug level logging could be added to this method and the methods it calls. Also, I think it would be beneficial to add debug level logging to the execute() method of vcs/base, in order to understand which commands are executed and how long they took to execute.
Thank you for your consideration.
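The kind of debug logging requested above could look roughly like the following. This is a sketch under assumptions, not the actual execute() method in Weblate: `timed_execute` and the logger name `vcs_debug_sketch` are hypothetical, and the command is run with the standard library's `subprocess.run`.

```python
import logging
import subprocess
import sys
import time

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("vcs_debug_sketch")


def timed_execute(args, cwd=None):
    """Run a command, return its stdout, and log its duration at debug level."""
    start = time.monotonic()
    try:
        completed = subprocess.run(
            args, cwd=cwd, capture_output=True, text=True, check=True
        )
        return completed.stdout
    finally:
        # Logged even when the command fails, so slow failures stay visible.
        elapsed = time.monotonic() - start
        logger.debug("exec %s [%.3fs]", " ".join(args), elapsed)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    # Any command works here; the Python interpreter is used for portability.
    print(timed_execute([sys.executable, "-c", "print('hi')"]))
```

With one such line per command, a gap like the two minutes between the repository being up to date and the lock being released could be attributed either to a specific command or to something outside command execution.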
I already tried
Steps to reproduce the behavior
Invoking the /file/ endpoint of a single component multiple times with small updates and a short time in between seems to trigger this problem.
Expected behavior
The repository lock should not be held for two minutes.
Screenshots
No response
Exception traceback
No response
How do you run Weblate?
Docker container
Weblate versions
weblate@3d2e89b9ce7d:/$ weblate list_versions
Weblate deploy checks
Additional context
No response