Allow to delay/limit the number of GitHub repo checks and Pull requests in time #7690
Seems like your Celery workers were not running, and the queued tasks are now being processed. I don't think this is a scenario we should support. In normal operation, rate limiting is not needed as commits don't happen often enough to hit any limits.
This issue looks more like a support question than an issue. We strive to answer these reasonably fast, but purchasing the support subscription is not only more responsible and faster for your business but also makes Weblate stronger. In case your question is already answered, making a donation is the right way to say thank you!
You could be right about the Celery workers ... now they run, but simply too fast ... also, as said, they sometimes retry every 25 seconds ... in my eyes this should be slowed down in general ... with many repositories and an active translation community it could be possible to run into rate limits in normal cases too :-)
We're not running into GitHub limits on Hosted Weblate with thousands of projects, but indeed it could possibly happen.
This issue has been put aside. It is currently unclear if it will ever be implemented as it seems to cover too narrow of a use case or doesn't seem to fit into Weblate. Please try to clarify the use case or consider proposing something more generic to make it useful to more users.
We also didn't for the last 2 years :-) Now we hit that issue and had no way to fix it besides:
We see that the same repo is hammered every 5-20 seconds ... this doesn't make the rate limit any better :-) and it "blocks the Weblate repo" but still tries to push because it can (or at least thinks it can). Why does it not detect the rate limit response and change the retry strategy? In fact, it's completely up to you what you do ... I can just say: I have a real-life issue and am trying to discuss/bring up ideas on how to prevent such issues for others - even if they are rare ones that "should" never happen.
There is no reason why a single repo should get a pull request that often. See https://docs.weblate.org/en/latest/admin/continuous.html#lazy-commit for info on when Weblate commits changes to Git (which in the default configuration triggers pushing to the upstream repository).
In case the queue is long, there should be an exclamation mark in the top navigation for all superusers, or it can be seen in the performance view. See also https://docs.weblate.org/en/latest/admin/install.html#monitoring-weblate, https://docs.weblate.org/en/latest/faq.html#how-can-i-check-whether-my-weblate-is-set-up-properly, https://docs.weblate.org/en/latest/admin/install.html#monitoring-celery-status
The issue here is that the repo operation fails (due to the rate limit), so it will be retried after about 25 seconds. Note: I'm working on the same issue together with @Apollon77
@nijel > there should be an exclamation mark in top navigation for all superusers, or it can be seen in the performance view. Hm ... As admin I'm not in the tool daily ... I would have expected an email ... Here is what Sentry shows about it ... since yesterday we have had this error 12,000 (!!) times for roughly 80 repositories ... and for most of the last 24h Weblate was offline to hopefully restore the rate limits somehow - so let it be only 5h in that timeframe
The e-mails are sent using Celery as well, so it would be a bit impractical to try to notify about non-working Celery using e-mail. The best approach is to add Weblate metrics to whatever monitoring you are using. There is a metrics API endpoint exposing all the important info - the most important one to look for is
Anyway, back to the original topic - there had to be a huge queue of tasks which caused this. There is no retrying of failed pushes; it just tries again when a commit is triggered. Also, it's not caused by Weblate being offline, but by part of Weblate being alive and part not.
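The metrics endpoint mentioned above can feed an external monitor instead of relying on e-mail. A minimal sketch, assuming the documented `/api/metrics/` endpoint and a `celery_queues` mapping in its JSON response (the key name and the alert threshold are assumptions here, adjust to your instance):

```python
# Hedged sketch: poll Weblate's metrics endpoint from an external monitoring
# job and alert when a Celery queue backlog grows. The /api/metrics/ endpoint
# is documented; the "celery_queues" key name is an assumption in this sketch.
import json
import urllib.request

QUEUE_ALERT_THRESHOLD = 100  # tune to your installation

def fetch_metrics(base_url: str, token: str) -> dict:
    """Fetch the JSON metrics document from a Weblate instance."""
    req = urllib.request.Request(
        f"{base_url}/api/metrics/",
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def queue_alerts(metrics: dict, threshold: int = QUEUE_ALERT_THRESHOLD) -> list:
    """Return names of Celery queues whose backlog exceeds the threshold."""
    queues = metrics.get("celery_queues", {})
    return [name for name, length in queues.items() if length > threshold]
```

Wiring `queue_alerts` into an existing monitoring system (Prometheus, Zabbix, a cron job that pages) catches the "part of Weblate alive, part not" failure mode described above.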
Seems we also have this problem: https://hosted.weblate.org/projects/jasp/jaspcircular-qml/#alerts |
@shun2wang Does it happen regularly? I've just manually pushed the repo (you could have done that as well).
@nijel Yes, I'm not sure of the regularity, but we've had this problem many times. It clears the translated characters on Weblate. Here we are using a GitHub workflow to update translations; this work is done automatically. EDIT: I just learned that my colleague at JASP has contacted you, thank you
That action can always lose translations – you don't force Weblate to push changes before trying to merge. So, there is always the possibility that Weblate has pending changes which have not been committed yet (see https://docs.weblate.org/en/latest/admin/continuous.html#lazy-commit). To be on the safe side, invoke
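Such a workflow could first ask Weblate to commit and push its pending (lazy-committed) changes before merging. A hedged sketch using the documented component repository endpoint (`POST /api/components/{project}/{component}/repository/` with an `operation` field); the base URL, project/component names, and token below are placeholders:

```python
# Hedged sketch: before CI merges upstream changes, ask Weblate to commit and
# push pending translations so lazy-committed strings are not lost. Uses the
# documented repository endpoint; all concrete names here are examples.
import json
import urllib.request

def repository_request(base_url: str, project: str, component: str,
                       operation: str, token: str) -> urllib.request.Request:
    """Build a POST request for Weblate's component repository endpoint."""
    url = f"{base_url}/api/components/{project}/{component}/repository/"
    data = json.dumps({"operation": operation}).encode()
    return urllib.request.Request(
        url,
        data=data,
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def push_pending(base_url: str, project: str, component: str, token: str) -> None:
    """Commit pending changes, then push them upstream."""
    for operation in ("commit", "push"):
        req = repository_request(base_url, project, component, operation, token)
        with urllib.request.urlopen(req) as resp:
            json.load(resp)  # raises on a malformed response body
```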
Back to the original topic – the problem is that GitHub doesn't give any information about what is triggering this:
I will ask their support for more info.
Okay, there is nothing better than trying and slowing down if we hit this:
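The "slow down when we hit this" idea could be sketched as a retry loop with exponential backoff that honors GitHub's `Retry-After` header when one is sent on a secondary-rate-limit response. This is an illustrative sketch, not Weblate's actual implementation; `perform_request` is a hypothetical callable returning status, headers, and body:

```python
# Hedged sketch: retry a GitHub API call with exponential backoff when the
# secondary rate limit is hit. GitHub signals this with a 403 or 429 response;
# a Retry-After header, when present, takes precedence over our own backoff.
import time

def call_with_backoff(perform_request, max_attempts=5, base_delay=1.0,
                      sleep=time.sleep):
    """perform_request() must return (status_code, headers_dict, body)."""
    delay = base_delay
    for _attempt in range(max_attempts):
        status, headers, body = perform_request()
        if status not in (403, 429):
            return status, headers, body
        retry_after = headers.get("Retry-After")
        wait = float(retry_after) if retry_after else delay
        sleep(wait)
        delay *= 2  # exponential backoff for the next round
    raise RuntimeError("GitHub rate limit: giving up after retries")
```

The `sleep` parameter is injected only so the behavior can be tested without real delays.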
The problem occurred again. This time a single string was changed, which caused updates in approximately 100 components. Weblate tried to create pull requests for all of them (nearly) at once, hitting the GitHub limit again. According to GitHub (https://docs.github.com/en/rest/guides/best-practices-for-integrators?apiVersion=2022-11-28#dealing-with-secondary-rate-limits) there should be a delay of at least 1s between two GitHub requests. So please consider simply adding a possibility to configure some delay between two GitHub requests.
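The requested minimum delay between two requests could look roughly like this (an illustrative sketch only; `MinIntervalThrottle` is a hypothetical name, and the clock/sleep parameters exist just to make it testable):

```python
# Hedged sketch: enforce a minimum interval between consecutive requests to
# the same API host, matching GitHub's recommendation of at least one second
# between write requests to avoid secondary rate limits.
import time

class MinIntervalThrottle:
    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self.clock = clock
        self.sleep = sleep
        self._last = None  # monotonic timestamp of the previous call

    def wait(self):
        """Sleep just long enough so calls are at least `interval` apart."""
        now = self.clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self.sleep(remaining)
                now += remaining
        self._last = now
```

Calling `throttle.wait()` immediately before each pull-request creation would spread the ~100 PRs over ~100 seconds instead of firing them all at once.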
Share the code among classes to have consistent error handling and to reuse the code. This will allow adding more features to the shared code later (rate limiting as described in #7690).
Perform API requests with a lock held to ensure we do not perform more than one of them at a time. Issue #7690
This should reduce the number of issues with GitHub secondary rate limiting and make Weblate behave nicer to the services. Fixes #7690
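The approach these commits describe - serializing API requests behind a lock and pacing them - could be sketched roughly like this (names and structure are illustrative, not Weblate's actual code):

```python
# Hedged sketch: hold a lock around each API request so only one request per
# service runs at a time, combined with a minimum interval between requests.
import threading
import time

class SerializedClient:
    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._lock = threading.Lock()
        self._interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0  # earliest monotonic time for the next call

    def request(self, perform):
        """Run perform() with the lock held, pacing calls by `interval`."""
        with self._lock:
            wait = self._next_allowed - self._clock()
            if wait > 0:
                self._sleep(wait)
            try:
                return perform()
            finally:
                # Update the timestamp before releasing the lock, so the
                # next caller computes its delay from the fresh value.
                self._next_allowed = self._clock() + self._interval
```

One instance would be shared per service (e.g. per GitHub host), so concurrent Celery workers pushing different components still queue behind the same lock.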
Thanks a lot for the improvement, we are looking forward to version 4.15.1!
I'm curious if it really addresses the issue in all cases, or whether some additional modifications will be needed. Feedback is welcome.
Thanks very much for the change. If the issue still occurs, one improvement could be to delay retries significantly. As Weblate normally does not update in real time, it could be a good idea to delay retries AFTER errors for several minutes or longer to avoid triggering any rate limits due to (failing) retries. But this is not a priority - at the moment we will simply evaluate whether the problem arises again.
I think we still have problems on 4.15.1-dev, see here.
This way we better handle sleeping in concurrent contexts. Fixes #7690
Otherwise the next consumer might not get the updated timestamp. Fixes #7690
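The concern in these two commits - sleeping in concurrent contexts and making sure the next consumer sees the updated timestamp - can be illustrated by reserving the next slot while still holding the lock and sleeping outside it (a hedged sketch, not the actual Weblate change):

```python
# Hedged sketch: each worker reserves its send slot under the lock (so the
# next consumer immediately sees the updated timestamp), then sleeps outside
# the lock until its slot. This avoids holding the lock while sleeping and
# avoids two workers computing their delay from the same stale timestamp.
import threading
import time

class SlotLimiter:
    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self._lock = threading.Lock()
        self._interval = interval
        self._clock = clock
        self._sleep = sleep
        self._next_slot = 0.0  # monotonic time of the next free slot

    def acquire_slot(self):
        """Reserve the next slot under the lock; sleep outside it."""
        with self._lock:
            now = self._clock()
            slot = max(now, self._next_slot)
            # Bump before releasing so concurrent callers queue behind us.
            self._next_slot = slot + self._interval
        if slot > now:
            self._sleep(slot - now)
        return slot
```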
Okay, I will try some more tweaking. |
Describe the problem
We have an on-premise Weblate installation which is connected to more than 80 different GitHub repositories, because we have a very decentralized approach of plugins for a smart home system. They are configured as described in https://docs.weblate.org/en/latest/vcs.html?highlight=github#github-pull-requests
Today, after a reboot of the Weblate container, we got many emails from Weblate telling us that
or later also
It seems to me that something was hanging, so Weblate did not update repos over time, and many, many were outdated on start. Now it started doing the updates and pull requests one after the other (when checking the process list it felt like partially 2 in parallel?)
Also: repos that got such an error seem to be retried every 20 seconds! (at least from what Weblate shows me under "last seen" on the warnings page)
Describe the solution you'd like
It would be great to have an option to delay such "mass updates" to be able to match them with the GitHub rate limits ... e.g. one per 5 minutes or something like that.
Additionally, there should be a configurable "delay after one repo got an error before it is retried" ... it seems we have some repos that are retried close to once every 20 seconds!
Describe alternatives you've considered
None so far, because none came to mind ... We also need to find out what actually happened that made it "stop processing the updates" in general ...
Screenshots
No response
Additional context
No response