Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"netlink socket stream shut down" infinite loop #4287

Closed
nazar-pc opened this issue Aug 2, 2023 · 9 comments
Closed

"netlink socket stream shut down" infinite loop #4287

nazar-pc opened this issue Aug 2, 2023 · 9 comments

Comments

@nazar-pc
Copy link
Contributor

nazar-pc commented Aug 2, 2023

Summary

I have not yet identify how exactly, but when doing early shutdown of an application I see this printed forever with Ctrl+C not working, so looks like in the main thread somehow:

2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: error flushing netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker libp2p_mdns::behaviour: if watch returned an error: rtnetlink socket closed    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: reading incoming messages    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::codecs: NetlinkCodec: decoding next message    
2023-08-02 21:07:17.697 ERROR tokio-runtime-worker netlink_proto::framed: failed to read from netlink socket: Custom { kind: Other, error: "A Tokio 1.x context was found, but it is being shutdown." }    
2023-08-02 21:07:17.697  WARN tokio-runtime-worker netlink_proto::connection: netlink socket stream shut down    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwarding unsolicited messages to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: forwaring responses to previous requests to the connection handle    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: handling requests    
2023-08-02 21:07:17.697 DEBUG tokio-runtime-worker netlink_proto::connection: sending messages    

I think the solution here might be to be prepared for runtime shutdown at any time.

Version

0.52.1, but I observed this on previous versions too

Would you like to work on fixing this bug?

Maybe

@thomaseizinger
Copy link
Contributor

Related: #3390.

@thomaseizinger
Copy link
Contributor

Related: #2591.

@nazar-pc
Copy link
Contributor Author

This happens when during Substrate node initialization I press Ctrl+C, so it has to shut down before it even started properly, in this case above errors are printed until process is killed extrenally.

@thomaseizinger
Copy link
Contributor

I am almost certain that this would be fixed by #2591 because that would move the IO off the main task which is typically the one running in main.rs.

@dkuhnert
Copy link

I just ran into the same issue. I think i tracked it down to the mdns polling if_watch in a loop.
When rtnetlink socket is closed if_watch stream will return an error every time poll is called.
So either the if_watch stream can be changed to return None (and thus end the stream) if the socket is closed or add an break statement to be mdns behaviour when if_watch returns an error.

@nazar-pc
Copy link
Contributor Author

Actually I don't see it in the latest version of libp2p anymore. Are you on older version by any chance?

@dkuhnert
Copy link

I'm on latest:

% cargo tree -i libp2p
libp2p v0.53.2
...

This happend to me in an unittest which runs very quickly. When i run this test on linux it blocks forever but runs fine on macOS.

The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.

@dariusc93
Copy link
Member

The if_watch poll never returns Poll::Pending so the while loop in mdns can never complete when the netlink socket is closed.

It does return Poll::Pending but its just behind the ready! macro

https://github.com/mxinden/if-watch/blob/31c4780d803431914aab95ec9139aa8e5795f80c/src/linux.rs#L107-L123

@jxs
Copy link
Member

jxs commented Nov 5, 2024

will close this for now as @nazar-pc mentions it doesn't reproduce anymore.
Anyone feel free to re-open if not

@jxs jxs closed this as completed Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants