How to do (next gen, SC oriented) remote error serialization for cross host/process propagation? #5
More hints from:

Thanks to @njsmith for pointing out the traceback serializers in
I think that we want to have error propagation that somehow includes explicit mention of host boundaries in a readable way. That's probably the relatively easy part of the issue, but I don't want it to be overlooked. The traceback should show when the host/process changes, which means that the code itself may be different. If there were some way to also give a good representation of which version of the code was found on the other side, that seems really excellent. Of course, if everything really is on the same version of the same code, then it'll be redundant.

And we could, when things are happy, potentially unserialize the exceptions and raise them as more than just a wrapper exception carrying the traceback from the other server, which would be cool. Not something that we can rely on in all cases though, so we have to have a good fallback for when things don't match and the exception and traceback don't propagate well.

This is all really interesting, and I don't know what I'm talking about, but it looks very neat.
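A minimal sketch of what boundary-aware error packing might look like (all names here are hypothetical, not `tractor`'s actual API): the raising side records the exception type, its args, the formatted traceback text, and the host it was raised on; the receiving side tries to rebuild the original exception and falls back to a generic wrapper when the type can't be resolved locally (e.g. the peer runs different code):

```python
import builtins
import socket
import traceback


def pack_error(exc: BaseException) -> dict:
    """Serialize an exception into a plain msg dict, recording which
    host it was raised on so cross-host hops show up readably."""
    return {
        "type": type(exc).__name__,
        "args": exc.args,
        "host": socket.gethostname(),
        "tb_str": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }


class RemoteError(Exception):
    """Fallback wrapper used when the original type can't be rebuilt;
    its message embeds the remote host and traceback text."""
    def __init__(self, msg: dict):
        super().__init__(
            f"{msg['type']}{msg['args']} raised on host {msg['host']}\n"
            f"--- remote traceback ---\n{msg['tb_str']}"
        )


def unpack_error(msg: dict) -> BaseException:
    """Try to reconstruct the original builtin exception type; fall
    back to the wrapper when the type isn't available locally."""
    exc_type = getattr(builtins, msg["type"], None)
    if isinstance(exc_type, type) and issubclass(exc_type, BaseException):
        return exc_type(*msg["args"])
    return RemoteError(msg)
```

Usage: `unpack_error(pack_error(exc))` gives back a real `ZeroDivisionError` when both sides agree on the type, and a `RemoteError` (still showing the remote traceback text) when they don't.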
I think this should mostly be included in a mailbox / address in every message (the actor model way). Right now

In my mind the primary code that cares about remote errors is an actor's supervisor, and I wonder: should a super care about what version of the code is being run? Is this maybe the concern of something else? For example, if the application required that info, couldn't the parent just ask its child for a version immediately after spawning? When will it be useful to a super, in the general case, to know about its child's code version? Maybe in a system where there is hot code swapping like in

I think the main question is how much a remote super needs to know about a child's error types / internal code. To me, too much coupling here would mean the super is more part of the app than part of the distributed computing system - which maybe is fine in some cases, but then won't the super need special consideration for the details of the child anyway? At face value it would seem a super needs to know as much about a child's remote errors as a

If a super is supposed to fulfill its conventional role, then I think some set of error "classes" might be necessary to help (custom) supervisor authors determine what types of failure recovery (or cancellation) logic are available. Having a set of contracts for which errors should be raised in which situations is something that can be iterated on over time if designed right - but there will still be a foreseeable super-handlers-to-error-types compatibility problem across multiple versions running in the same cluster(s).

Anyway, too many new questions 😼! The short assertion is that we already pack task info in the exception msg and announce / pack the actor uid in the

I don't at the moment see any problem with requiring all such remote errors to include the address / actor uid / task uid info in every error. It's probably just going to make logging-system integration that much easier and more useful. I also don't see a problem with reconstructing remote errors into local objects, other than performance.
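To make the "address / actor uid / task uid in every error" idea concrete, here's a hypothetical msg schema (field names made up, not `tractor`'s actual wire format) plus a toy supervisor policy keyed off coarse error "classes" rather than the child's concrete internal types:

```python
def pack_actor_error(
    exc: BaseException,
    actor_uid: tuple[str, str],
    address: tuple[str, int],
    task_uid: str,
) -> dict:
    """Pack an error msg that always carries routing identity:
    who failed (actor uid), where (address), and which task."""
    return {
        "error": {
            "type": type(exc).__name__,
            "args": list(exc.args),
        },
        "actor_uid": actor_uid,  # (name, uuid) pair of the failed actor
        "address": address,      # transport address the actor lived at
        "task_uid": task_uid,    # the specific task that raised
    }


# A toy "error class" contract: coarse categories a custom supervisor
# can handle without being coupled to the child's internal exceptions.
RECOVERABLE = {"TimeoutError", "ConnectionResetError"}


def should_restart(msg: dict) -> bool:
    """Decide recovery purely from the declared error class in the msg,
    keeping the super decoupled from the child's code version."""
    return msg["error"]["type"] in RECOVERABLE
```

The point of the sketch: the supervisor only ever inspects the msg dict, so the compatibility surface between super and child is the schema and the agreed error classes, not the child's codebase.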
Heh, so we're already kinda requiring the whole uid-in-error-as-msg bit as part of the soon-to-land #357, and we might as well use the new multi-address support we're experimenting with in #367. Addresses in every error seems like a handy thing for unwinding complex inter-actor-tree service failures, especially if we ever get to multi-host supervision APIs down the road..
It's like the sloppiest and laziest thing atm..

Doesn't `rpyc` have some fancy way it does this? Seems like there's a homegrown traceback serializer; here's their theory of operation. I specifically don't want to go down the proxy route (one of `tractor`'s tenets) but I think for exceptions it's a special case.
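One proxy-free approach (a sketch, not `tractor`'s implementation; CPython's `multiprocessing.pool` uses a similar `RemoteTraceback` trick internally) is to ship only the *text* of the remote traceback and chain it onto a locally-raised exception, so the standard `raise X from Y` display shows both sides of the host boundary without any live object proxies:

```python
class RemoteTraceback(Exception):
    """Carries only the text of a remote traceback; no live objects
    cross the wire, unlike an rpyc-style object proxy."""
    def __init__(self, tb_str: str):
        self.tb_str = tb_str
        super().__init__(tb_str)


def reraise_remote(exc: BaseException, tb_str: str) -> None:
    """Raise a local exception explicitly chained to the remote
    traceback text via __cause__, so the default excepthook prints
    the remote frames above the local ones."""
    raise exc from RemoteTraceback(tb_str)
```

The downside relative to full unserialization is that the remote frames are inert text (no locals to inspect), which is exactly the fallback behavior wanted when code versions don't match.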