[omdb] add basic support for activating background tasks #5615
Created using spr 1.3.6-beta.1
Yeah, I've also kind of wanted what you described (a way to wait for the next activation to complete that was triggered after some point). I keep talking myself out of it. I think we want to look at the specific use cases really carefully. Background tasks exist in part to carry out operations that can take a really long time or are fairly likely to experience transient errors (e.g., because they operate on all sleds). They may take many laps to finish applying a change. In the meantime, more changes may accumulate. This makes it hard to have a notion of linear progress, though it's still important to provide clear status (e.g., "of the 3 DNS servers, 2 are at version 8, which is the latest, and 1 is at version 7, which is an hour old"). I suspect that what most programmatic consumers want is not to wait for a specific activation to complete but rather for some specific set of changes to be applied, which could take many laps. (Complicating it further, they might also want to give up waiting if there was a successful lap and the change was not applied.)
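To make the distinction concrete, here is a rough sketch of what "wait for a change to be applied, not for a specific activation" could look like for a programmatic consumer. Everything in it is hypothetical and for illustration only: `TaskStatus`, `LapOutcome`, `WaitResult`, and `wait_for_version` are made-up names, not existing omdb or Nexus APIs, and the `poll` closure stands in for however a consumer would actually fetch status.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical outcome of one lap (activation) of a background task.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LapOutcome {
    Succeeded,
    Failed,
}

/// Hypothetical status snapshot that a consumer polls.
#[derive(Debug, Clone)]
struct TaskStatus {
    /// Version currently held by each DNS server.
    server_versions: Vec<u64>,
    /// Total number of laps the task has completed so far.
    laps_completed: u64,
    /// Outcome of the most recently completed lap, if any.
    last_lap: Option<LapOutcome>,
}

#[derive(Debug, PartialEq, Eq)]
enum WaitResult {
    /// Every server reports at least the target version.
    Applied,
    /// A lap succeeded after we started waiting, yet the change is missing.
    NotAppliedAfterSuccessfulLap,
    /// The deadline passed without either of the above.
    TimedOut,
}

/// Polls until the condition we care about holds (all servers at `target`),
/// rather than waiting for any one activation to finish.
///
/// Assumption: a lap counted after `laps_at_start` also *started* after the
/// change was made. A real implementation would have to check that.
fn wait_for_version<F>(
    mut poll: F,
    target: u64,
    laps_at_start: u64,
    timeout: Duration,
    interval: Duration,
) -> WaitResult
where
    F: FnMut() -> TaskStatus,
{
    let deadline = Instant::now() + timeout;
    loop {
        let status = poll();
        let applied = !status.server_versions.is_empty()
            && status.server_versions.iter().all(|v| *v >= target);
        if applied {
            return WaitResult::Applied;
        }
        // Give up early: a clean lap finished and the change still is not there.
        if status.laps_completed > laps_at_start
            && status.last_lap == Some(LapOutcome::Succeeded)
        {
            return WaitResult::NotAppliedAfterSuccessfulLap;
        }
        if Instant::now() >= deadline {
            return WaitResult::TimedOut;
        }
        thread::sleep(interval);
    }
}
```

The point of the sketch is that failed laps are simply waited through, so transient errors do not need special handling, while a successful lap that leaves the change unapplied is surfaced as its own result instead of looking like a timeout.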
I think I generally agree with this model for programmatic consumers of background tasks (especially as compared to wicketd, which is much more of a foreground/oneshot approach). I think humans would benefit greatly from being able to wait until the next activation after the trigger point completes. As a human looking at a live system, it can be hard to express exactly the constraint one is waiting for. (You could write Rust code for specific cases, like waiting for a new sled to show up in inventory for the first time, but in general I think you'd need some kind of dynamic query language that covers everything operators could reasonably ask for.) In general, I think transient errors like not being able to talk to a sled will be quite rare -- and if an operator or support tech is actively monitoring the situation, they can observe those errors and kick off another run of the background task. The update engine can also be used for post-mortem analysis as long as the generated events are written to a log file -- we already have code that can read and replay logs in a nice fashion, which we've integrated into wicketd.
Is this something where we could leave it to the human to poll by hand on whatever condition they care about (e.g., I also think it would be fine to have richer debugging for tasks like you're suggesting. We could have APIs to list recent activations and their statuses and/or to wait for an activation to complete. Or to wait for activation N to complete. I just think there's a ton of stuff we could do here (so there's some risk of scope creep) and some of it could be easily misused to build brittle stuff. So I'm wondering how big a pain point it currently is. If it is painful today in dev then yeah maybe we should do it and see if we can make sure people don't accidentally use it to build programmatic consumers that ought to be checking other conditions instead.
Yeah, let's see how it plays out over the next while as we add more functionality to our background task system, and decide from there.
Need to wait until 5620 lands.
I realized that we have this wonderful `SledFilter` enum lying around, and we can just use it in omdb. I decided to shorten "eligible-for-discretionary-services" to "discretionary", which I hope communicates the same meaning but in a shorter manner. Depends on #5615.
This does its basic job of activating a background task, e.g. `inventory_collection`.
It is a little unsatisfying because it's a bit hard with the current structure
to do things like:
Providing this kind of progress reporting is one of the kinds of problems we
solved with the update engine, and I'm wondering if it makes sense to try and
integrate that at some point.
Fixes #5058.