Interval of update loop varies and can slow down on slow backend #53
Comments
It could also be that running the next update a full interval after the previous one finished is the desired behaviour. In that case this one can be closed (and I'll make a separate issue for the http metrics part).
I agree with your view on the matter and noticed this while working on VRP expiry stuff. That whole refresh/VRP expiry piece (in #15) needs to be broken out to accomplish this and test it properly. So I should be able to address this as part of that work. When I push them, we should discuss the default timer values.
Previously, if you had a very slow backend, the refresh timer for a reload would only start after the current refresh had finished. Now the timer counts from the moment it last fired, regardless of how long the refresh takes. This helps avoid the client being torpedoed by very slow backends. Tag: #53
I split two of the subpoints into their own tickets, since they are worth their own investigations for now. But the update loop now happens consistently, even if the backend is slow, and VRP+SLURM updates are done in parallel.
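For readers following along, here is a minimal sketch of what fetching VRP and SLURM in parallel could look like. It is not the stayrtr implementation: `fetchFunc` and `fetchBoth` are hypothetical names, and the fetchers are stand-ins for whatever actually retrieves the two files.

```go
package main

import (
	"fmt"
	"sync"
)

// fetchFunc is a hypothetical stand-in for whatever fetches the VRP JSON
// or the SLURM file; it is not a stayrtr type.
type fetchFunc func() (string, error)

// fetchBoth runs both fetches concurrently and waits for both to finish,
// so the total time is roughly that of the slower fetch rather than the sum.
func fetchBoth(fetchVRP, fetchSLURM fetchFunc) (vrp, slurm string, vrpErr, slurmErr error) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		vrp, vrpErr = fetchVRP()
	}()
	go func() {
		defer wg.Done()
		slurm, slurmErr = fetchSLURM()
	}()
	wg.Wait() // results are only read after both goroutines are done
	return vrp, slurm, vrpErr, slurmErr
}

func main() {
	vrp, slurm, vrpErr, slurmErr := fetchBoth(
		func() (string, error) { return "vrp payload", nil },
		func() (string, error) { return "", fmt.Errorf("slurm backend unavailable") },
	)
	fmt.Println("VRP:", vrp, vrpErr)
	fmt.Println("SLURM:", slurm, slurmErr)
}
```

Errors are returned separately so that a failing SLURM fetch does not hide a successful VRP fetch; whether that is the right policy is a design choice this sketch does not settle.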
As a stayrtr operator I want stayrtr to keep fetching updates if the backend system is slow or not responsive.
If I want updates every 10 minutes, and an update takes 5 minutes, I want the next update to run 10 minutes after the previous one started. Not 15 minutes after (10 minutes after the previous one finished).
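The arithmetic above can be checked with a small simulation. This is only a sketch: `fixedDelayStarts` and `fixedIntervalStarts` are hypothetical helpers that compute when each update would start under the two policies, and the fixed-interval version simplifies the overrun case by starting the next update as soon as the late one finishes.

```go
package main

import (
	"fmt"
	"time"
)

// fixedDelayStarts models "wait interval after the previous update finished".
func fixedDelayStarts(interval, took time.Duration, n int) []time.Duration {
	starts := make([]time.Duration, 0, n)
	var start time.Duration
	for i := 0; i < n; i++ {
		starts = append(starts, start)
		start += took + interval
	}
	return starts
}

// fixedIntervalStarts models "start interval after the previous update started".
// Simplification: if an update overruns the interval, the next one starts
// right when it finishes.
func fixedIntervalStarts(interval, took time.Duration, n int) []time.Duration {
	starts := make([]time.Duration, 0, n)
	var start time.Duration
	for i := 0; i < n; i++ {
		starts = append(starts, start)
		next := start + interval
		if finished := start + took; finished > next {
			next = finished
		}
		start = next
	}
	return starts
}

func main() {
	interval, took := 10*time.Minute, 5*time.Minute
	fmt.Println("fixed delay:   ", fixedDelayStarts(interval, took, 3))
	fmt.Println("fixed interval:", fixedIntervalStarts(interval, took, 3))
}
```

With a 10 minute interval and a 5 minute update, the second update starts at 15 minutes under fixed delay and at 10 minutes under fixed interval, which is the difference this issue is about.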
Context
When running stayrtr from a slow connection (4G was not cooperating) I noticed that the update loop does not have a set interval but has a set delay. If the SLURM or JSON responses are slow, each iteration of the loop takes (much) longer.
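One way to get a set interval instead of a set delay in Go is to drive the loop from a `time.Ticker`, which keeps ticking while the refresh is running. The sketch below assumes a hypothetical `runEvery` helper with a bounded number of runs so that it terminates; a real loop would more likely run until a context is cancelled.

```go
package main

import (
	"fmt"
	"time"
)

// runEvery calls refresh immediately and then once per tick, runs times in total.
// The ticker is started before the first refresh, so time spent inside refresh
// counts towards the interval instead of being added on top of it.
func runEvery(interval time.Duration, runs int, refresh func()) {
	if runs <= 0 {
		return
	}
	ticker := time.NewTicker(interval)
	defer ticker.Stop()

	refresh()
	for i := 1; i < runs; i++ {
		<-ticker.C // blocks until the next tick, not for a full interval
		refresh()
	}
}

func main() {
	begin := time.Now()
	runEvery(200*time.Millisecond, 3, func() {
		fmt.Println("refresh started after", time.Since(begin).Round(10*time.Millisecond))
		time.Sleep(100 * time.Millisecond) // stand-in for a slow backend
	})
}
```

Note that a `time.Ticker` drops ticks for a slow receiver, so a refresh that takes longer than the interval does not cause a backlog of queued refreshes.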
Root cause
Handling slow responses is a hard problem. It ends up being a tradeoff between liveness of the whole system and getting all information.
For example, in my rpki-client wrapper I found that some repositories were so slow that they prevented me from updating on time. I decided to add a utility to time out/abort fetching from slow repos. There I decided finishing an update was more important than having all information.
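As an illustration of the timeout/abort approach, a deadline can be attached to each fetch with `context.WithTimeout`. This is a sketch under assumptions, not the wrapper mentioned above: `fetchFunc`, `fetchWithTimeout` and `slowFetch` are hypothetical names, and `slowFetch` only simulates a backend that takes a given time to answer.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// fetchFunc is a hypothetical fetcher that must honour context cancellation.
type fetchFunc func(ctx context.Context) ([]byte, error)

// fetchWithTimeout gives the fetch at most timeout to complete.
func fetchWithTimeout(timeout time.Duration, fetch fetchFunc) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return fetch(ctx)
}

// slowFetch simulates a backend that answers after delay,
// unless the context is cancelled first.
func slowFetch(delay time.Duration) fetchFunc {
	return func(ctx context.Context) ([]byte, error) {
		select {
		case <-time.After(delay):
			return []byte("payload"), nil
		case <-ctx.Done():
			return nil, ctx.Err()
		}
	}
}

func main() {
	_, err := fetchWithTimeout(50*time.Millisecond, slowFetch(2*time.Second))
	fmt.Println("slow backend:", err)

	data, err := fetchWithTimeout(time.Second, slowFetch(10*time.Millisecond))
	fmt.Println("fast backend:", string(data), err)
}
```

For real HTTP fetches the same context can be passed to `http.NewRequestWithContext`, so that the request itself is aborted when the deadline passes.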
Desired behaviour
first of all:
RefreshStatusCode etc. could be tracked from the http util.
then: