fix(iroh-net)!: improve magicsock's shutdown story (#2227)
## Description

- Waits for connections to actually be closed by calling `quinn::Endpoint::wait_idle` after close.
- Consumes the `MagicEndpoint` in an effort to express the fact that closing an endpoint is a terminal operation.
- Adds logging when closing the connections, in case it's deemed necessary.
- Changes `MagicSock::poll_recv` to return `Poll::Pending` on close instead of an error. More info on the logic behind this change below.
- Changes `MagicSock::poll_send` to not error if nothing is being sent.

## Breaking Changes

`MagicEndpoint::close` now consumes the endpoint.

## Notes & open questions

Shutting down `MagicSock` when `quinn` is done with it has proven to be something we can't do reliably.

- Drop based solutions (close when the `MagicSock` is dropped) are unreliable because the socket is held by multiple components in `quinn`, including spawned detached tasks. There is not really any way to trigger the dropping of these tasks/components from outside `quinn`, and it would require quite an effort to change `quinn` so that this happens inside it.
- Close based solutions were rejected. The objective here was to stop polling the socket when the endpoint was both closed (close called) and all connections gracefully finalized. The reasoning for the rejection is that `quinn` both receives and sends frames after close, to read new connection attempts and gracefully reject them. This is a fair argument on their side, despite it clearly being a limitation for reliably freeing resources from `quinn`.
- Taking into account the fact that the socket _will_ be polled after close, both to send and receive, returning an error from these will always produce the logging error `quinn::endpoint: I/O error: connection closed` we have been chasing, _even_ after the `quinn::Endpoint` has been dropped. Therefore changing `poll_recv` to return `Poll::Pending` addresses this annoyance.
- Note that the part about _gracefully_ shutting down is actually done by calling `quinn::Endpoint::wait_idle`, and that the (now averted) log error in `quinn` doesn't really tell us whether shutdown was graceful or not.
- Note that the above point creates an API disparity between `poll_send` and `poll_recv`: one will error on close, the other will return `Pending`. I find this OK if `MagicSock` is simply a part of our implementation, but right now it's also part of our public API. I wonder if the `MagicSock` makes sense to users on its own or if we should remove it from the public API.
- NOTE that a conversation with `quinn`'s maintainers has started, to get a better API that makes it possible to know when all resources have been freed. But this is playing the long game and we need to solve this on our end for now.

## Change checklist

- [x] Self-review.
- [x] Documentation updates if relevant.
- [ ] ~Tests if relevant.~
- [x] All breaking changes documented.
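
## Illustrative sketch

For readers who want the close semantics from the notes above in concrete form, here is a minimal std-only sketch. It is not the code in this change: `SketchSock`, `SketchEndpoint` and `noop_waker` are hypothetical names, there is no real I/O and no `quinn`, and the signatures are simplified. It shows three things, under those assumptions: `close` consuming the endpoint, `poll_recv` returning `Poll::Pending` once closed, and `poll_send` erroring after close unless there is nothing to send.

```rust
use std::io;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for the socket; NOT the real `MagicSock`.
pub struct SketchSock {
    closed: AtomicBool,
}

impl SketchSock {
    pub fn new() -> Self {
        Self { closed: AtomicBool::new(false) }
    }

    fn close(&self) {
        self.closed.store(true, Ordering::SeqCst);
    }

    /// Receive side: after close, report `Pending` instead of an error,
    /// so a caller that keeps polling has no I/O error to log.
    pub fn poll_recv(&self, _cx: &mut Context<'_>) -> Poll<io::Result<usize>> {
        if self.closed.load(Ordering::SeqCst) {
            // No waker is registered: a closed socket never becomes ready.
            return Poll::Pending;
        }
        // Assumption: no real I/O here, so an open socket reports 0 datagrams.
        Poll::Ready(Ok(0))
    }

    /// Send side: still errors after close, but not when there is nothing to send.
    pub fn poll_send(
        &self,
        _cx: &mut Context<'_>,
        transmits: &[&[u8]],
    ) -> Poll<io::Result<usize>> {
        if transmits.is_empty() {
            return Poll::Ready(Ok(0));
        }
        if self.closed.load(Ordering::SeqCst) {
            return Poll::Ready(Err(io::Error::new(
                io::ErrorKind::NotConnected,
                "connection closed",
            )));
        }
        // Assumption: pretend every transmit was sent.
        Poll::Ready(Ok(transmits.len()))
    }
}

/// Hypothetical stand-in for the endpoint; NOT the real `MagicEndpoint`.
pub struct SketchEndpoint {
    sock: Arc<SketchSock>,
}

impl SketchEndpoint {
    pub fn new() -> Self {
        Self { sock: Arc::new(SketchSock::new()) }
    }

    /// Other components may hold the socket too, which is why drop-based
    /// shutdown is unreliable.
    pub fn sock(&self) -> Arc<SketchSock> {
        Arc::clone(&self.sock)
    }

    /// Taking `self` by value makes close terminal: any later use of the
    /// endpoint is a compile error.
    pub fn close(self) {
        self.sock.close();
    }
}

/// A waker that does nothing, only needed to build a `Context` by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

pub fn noop_waker() -> Waker {
    Waker::from(Arc::new(NoopWake))
}
```

In this sketch, a component that still holds an `Arc<SketchSock>` after `close` sees `Pending` from `poll_recv` and an error only from a non-empty `poll_send`, which mirrors the API disparity discussed in the notes.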