Wait on NestedSubsystem #37
Hey! That was the PR: https://github.com/Finomnis/tokio-graceful-shutdown/pull/36/files The reason I stopped this is because it inherently created a couple of race conditions. Like, where should potential errors be delivered? To The tldr; of that is that instead of making subsystems joinable, I might instead make What are your thoughts? |
Yeah that sounds good. Those are the semantics I want from NestedSubsystem anyways, I was wondering why it has to be a different type from Toplevel (I would expect just a single Subsystem type). I guess you'd have to figure out what semantics of |
Nested subsystems are part of the subsystem tree that propagates errors up to the toplevel object. If you think that's a bad design decision we can rethink the API in general. Toplevel objects, on the other hand, would catch errors and force you to consume them. Jah I thought about So I think all I will add is, additionally to |
One of my acceptance criteria for this task was even that I rewrite the example with the restarting subsystem, which would be exactly your use case, I guess. |
I'll continue working on it probably next week, I sadly don't have much time left this week |
Certainly would require some thought, but at least to me having just one type and recursive structure seems clearer. Still seems possible to create the same propagation semantics as today with some additional code tweaks to the API. Example of how I might imagine it working:
My 2c as someone new to the APIs, but I haven't spent too much time thinking through it so certainly possible the above doesn't make sense/isn't possible for some reason :) |
In general this is the right thought, the problem is error handling. One could of course say error handling should be done by the user and all we provide is the shutdown token, but I think then it is easier to use tokio's shutdown token directly. This will be the only difference between a nested subsystem and a nested toplevel: subsystems propagate errors upwards, while toplevels collect and process them. Maybe you have an idea of how to combine those two, I am running out of ideas. I had the idea of having a But still, there would be two different structs, one for a subsystem that can be joined and which does not propagate errors up, and the other one that can't be joined and forwards errors to the parent. (as their API would be different) |
I imagine each subsystem would be responsible for propagating errors manually up the chain if necessary. For most usage I expect calling
Maybe it's not ideal for some types of use cases though. |
I'm thinking about removing error propagation entirely, because after thinking about it a little more, you really don't gain much from automatic error-propagation to the top. The new api could resemble something like this:
I think the term It's important to make the joinhandles
Implementing your restart-a-subsystem-until-failure should also be trivial with this. The reason I kept What are your thoughts about that? Edit: |
I might attempt to re-write this library within the next weeks, by simply implementing an API skeleton with a bunch of |
Yeah, that looks good to me! Agreed that both suggestions are fairly similar. While searching around for some Rust async stuff, I had also come across tokio-rs/tokio#1879 and this proposal sounds quite similar to Structured Concurrency. I guess there's nothing new under the sun (or perhaps, great minds think alike 😛) Disclaimer: I haven't read through the issue in detail or any links. The last comment does mention some libs though, probably worth seeing how they compare to this proposal. |
I think structured concurrency, or specifically the 'nursery' pattern is pretty cool. I don't think it is strongly connected to this library though, this library should leave the specific use case up to the user imo and be as non-specialized as possible. |
OK, so just to clarify for me: the main difference is that structured concurrency requires nested scopes? And this proposal does not, because you can pass around the |
Tbh I haven't read that deeply into it. I'm familiar with the concepts, but both of them aren't really relevant for this library, so I haven't dealt with them in detail yet. |
Turns out there are problems with the simplified API. The biggest one is that the code the user will write is very quick to revert to cancellation instead of graceful shutdown. If we say the user has to forward errors, then that means that the user also has to initiate shutdown requests when a nested error happens. Example subsystem structure:

    SubsysA
      -> SubsysB
        -> SubsysC1
        -> SubsysC2

If subsystem But there is a high chance that

    tokio::select!(
        _ = c1_joinhandle => (...something...),
        _ = c2_joinhandle => (...something...),
    )

This code would cancel So how would you implement the behaviour "If either C1 or C2 throws an error, shut down the entire program"?

    async fn subsys_b(subsys: SubsystemHandle) -> Result<()> {
        let (c1_joinhandle, c1) = subsys.start("SubsysC1", subsys_c1);
        let (c2_joinhandle, c2) = subsys.start("SubsysC2", subsys_c2);
        let subsys_clone1 = subsys.clone();
        let subsys_clone2 = subsys.clone();
        // Await both children; on error, request a global shutdown before forwarding the error.
        let (err1, err2) = tokio::join!(
            async move {
                c1_joinhandle.await.map_err(|err| {
                    subsys_clone1.global_shutdown();
                    err
                })
            },
            async move {
                c2_joinhandle.await.map_err(|err| {
                    subsys_clone2.global_shutdown();
                    err
                })
            },
        );
        combine_errors(err1, err2)
    }

For one, this is a mess and not the programming style I'd like to cause with my API.
|
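To make the requirement above concrete ("if either C1 or C2 fails, shut everything down, but let the sibling stop gracefully"), here is a minimal sketch using only the standard library. It is an analogy, not this library's API: plain threads stand in for subsystems, an `AtomicBool` stands in for the shutdown token, and `run_child`, `spawn_child` and `run_parent` are hypothetical names invented for the illustration.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

type ChildResult = Result<String, String>;

/// Hypothetical child: works in small steps, fails at `fail_at_step` if one is
/// given, and otherwise returns cleanly once a shutdown has been requested.
fn run_child(name: &'static str, shutdown: &AtomicBool, fail_at_step: Option<u32>) -> ChildResult {
    let mut step = 0;
    loop {
        if shutdown.load(Ordering::SeqCst) {
            // Graceful path: the child decides itself when and how to stop.
            return Ok(format!("{name} stopped gracefully"));
        }
        if Some(step) == fail_at_step {
            return Err(format!("{name} failed at step {step}"));
        }
        thread::sleep(Duration::from_millis(5));
        step += 1;
    }
}

/// Spawn a child that requests a shutdown (assumed stand-in for a global
/// shutdown request) when it fails, then hands its result to whoever joins it.
fn spawn_child(
    name: &'static str,
    shutdown: &Arc<AtomicBool>,
    fail_at_step: Option<u32>,
) -> JoinHandle<ChildResult> {
    let shutdown = Arc::clone(shutdown);
    thread::spawn(move || {
        let result = run_child(name, &shutdown, fail_at_step);
        if result.is_err() {
            shutdown.store(true, Ordering::SeqCst);
        }
        result
    })
}

/// Hypothetical parent: joins BOTH children, so neither is dropped or cancelled.
fn run_parent() -> (ChildResult, ChildResult) {
    let shutdown = Arc::new(AtomicBool::new(false));
    let c1 = spawn_child("SubsysC1", &shutdown, Some(3));
    let c2 = spawn_child("SubsysC2", &shutdown, None);
    let r1 = c1.join().expect("SubsysC1 panicked");
    let r2 = c2.join().expect("SubsysC2 panicked");
    (r1, r2)
}

fn main() {
    let (r1, r2) = run_parent();
    println!("{r1:?}");
    println!("{r2:?}");
}
```

The point of the sketch is the shape of the control flow: the failing child triggers the shutdown request, while the parent still waits for every child instead of racing them against each other.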
Those are good points. Would some helper like
I believe this would work for my use case, but not sure if it addresses all your concerns and whether there are any other edge cases. |
While that would work, we are then back at square one where we cannot react to the termination of a single subsystem any more. I remind you that errors are non-copyable, so a singular error path is required. That would mean that joining a single subsystem would have to remove its error path from the handle's join call. I think we are slowly shifting back to the existing pattern then, and I feel like making the original |
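The "singular error path" constraint can be illustrated with a small std-only sketch. It assumes a hypothetical `Parent` type that owns one join handle per child; none of these names come from this library. Joining a single child removes its handle, so its error is consumed exactly once and no longer shows up when the remaining children are joined.

```rust
use std::collections::HashMap;
use std::thread::{self, JoinHandle};

type ChildResult = Result<(), String>;

/// Hypothetical parent that owns the join handle of every child it started.
struct Parent {
    children: HashMap<&'static str, JoinHandle<ChildResult>>,
}

impl Parent {
    fn start(&mut self, name: &'static str, task: fn() -> ChildResult) {
        self.children.insert(name, thread::spawn(task));
    }

    /// Join one child. Its handle is removed from the map, so its result
    /// (and any error) can only ever be delivered here.
    fn join_child(&mut self, name: &str) -> Option<ChildResult> {
        let handle = self.children.remove(name)?;
        Some(handle.join().expect("child panicked"))
    }

    /// Join all children that were not joined individually and collect their errors.
    fn join_rest(&mut self) -> Vec<(&'static str, String)> {
        let mut errors = Vec::new();
        for (name, handle) in self.children.drain() {
            if let Err(err) = handle.join().expect("child panicked") {
                errors.push((name, err));
            }
        }
        errors.sort(); // HashMap iteration order is unspecified
        errors
    }
}

fn failing_c1() -> ChildResult {
    Err("C1 failed".to_string())
}

fn failing_c2() -> ChildResult {
    Err("C2 failed".to_string())
}

fn demo() -> (Option<ChildResult>, Vec<(&'static str, String)>) {
    let mut parent = Parent { children: HashMap::new() };
    parent.start("C1", failing_c1);
    parent.start("C2", failing_c2);
    let c1 = parent.join_child("C1"); // consumes C1's error path
    let rest = parent.join_rest(); // only C2 is left to report
    (c1, rest)
}

fn main() {
    println!("{:?}", demo());
}
```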
Ah of course, I keep forgetting that because I don't need it, and it is difficult for me to understand without a specific example. In that case nesting |
I agree with your naming feedback. I'm open to suggestions. |
I didn't give a suggestion because I'm not good at naming :). Name should make it clear that the only real difference between the two (AIUI) is that one of them propagates errors up the chain and one doesn't. Some initial ideas are |
|
@viveksjain Would this help you? |
Yup, that looks great! |
@viveksjain Released as |
Happy to, will try to do so over the next week or so |
I did get the chance to try out the beta version, and happy to say that it works great. Helped me clean up my code quite a bit. Feel free to close this issue! One observation I will mention is that in most cases, I would like a newly started subsystem to

    /// Start a nested subsystem that `select!`s on `subsys.on_shutdown_requested()` to stop automatically.
    /// `subsystem_to_start` must have type `Future<Output=anyhow::Result<()>>`.
    #[macro_export]
    macro_rules! start {
        ($subsys:expr, $name:expr, $subsystem_to_start:expr) => {
            let subsys_clone = $subsys.clone();
            $subsys.start($name, move |_h: SubsystemHandle| async move {
                tokio::select! {
                    r = $subsystem_to_start => r,
                    _ = subsys_clone.on_shutdown_requested() => Ok::<(), anyhow::Error>(())
                }
            });
        };
    }

Will leave it up to you on whether such a feature makes sense to be in the lib or not (in which case it can probably be another function on |
Moved to new issue. |
Thanks for writing this library! It seems helpful for my async code. I couldn't figure out how to handle this use case though: I have a nested subsystem, which will shut itself down on error. I want to retry from the top level in a loop. Something like
How would I do the nested.wait() part?
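As a rough picture of the retry loop being asked about, here is a std-only sketch. It does not use this library: a plain thread stands in for the nested subsystem, `JoinHandle::join` stands in for the desired `nested.wait()`, and `nested_task` and `run_with_retries` are hypothetical names made up for the illustration.

```rust
use std::thread;

/// Hypothetical nested task: fails on the first two attempts, then succeeds.
fn nested_task(attempt: u32) -> Result<&'static str, String> {
    if attempt < 3 {
        Err(format!("attempt {attempt} failed"))
    } else {
        Ok("finished")
    }
}

/// Start the nested task, wait for it, and restart it on error,
/// giving up after `max_attempts` tries.
fn run_with_retries(max_attempts: u32) -> Result<(u32, &'static str), String> {
    let mut last_error = String::from("no attempts were made");
    for attempt in 1..=max_attempts {
        let handle = thread::spawn(move || nested_task(attempt));
        // Stand-in for `nested.wait()`: block until this attempt has ended.
        match handle.join().expect("nested task panicked") {
            Ok(value) => return Ok((attempt, value)),
            Err(err) => {
                eprintln!("nested task failed, restarting: {err}");
                last_error = err;
            }
        }
    }
    Err(last_error)
}

fn main() {
    println!("{:?}", run_with_retries(5));
}
```

The essential ingredient is that the caller can wait on one nested unit and receive its error locally, rather than having that error propagate straight to the top level.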