Need to clean up sockets for clients that do not end cleanly. #607

Closed
troy2914 opened this issue Dec 29, 2021 · 4 comments · Fixed by #612

Comments

@troy2914
Member

We had a situation where 68 workers were working on one IP. netstat
revealed that all 68 connections to that IP were in the FIN_WAIT2 state. The client should
have responded with FIN,ACK, but apparently did not.
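
For anyone wanting to check for the same pattern, here is a minimal sketch that counts FIN_WAIT2 sockets per remote IP from `netstat -tn` style text. The function name `count_fin_wait2` and the sample lines are made up for illustration (the addresses are documentation ranges, not data from this report), and it assumes the usual six-column Linux netstat layout.

```python
from collections import Counter


def count_fin_wait2(netstat_output: str) -> Counter:
    """Count sockets in FIN_WAIT2 per remote IP from `netstat -tn` style text."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the header row and any non-TCP lines
        if fields[5] != "FIN_WAIT2":
            continue
        # Split on the last colon so the port is dropped from "ip:port".
        remote_ip = fields[4].rsplit(":", 1)[0]
        counts[remote_ip] += 1
    return counts


# Made-up sample input, not taken from the affected server.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 192.0.2.10:43           198.51.100.7:51234      FIN_WAIT2
tcp        0      0 192.0.2.10:43           198.51.100.7:51240      FIN_WAIT2
tcp        0      0 192.0.2.10:43           203.0.113.5:40000       ESTABLISHED
"""

if __name__ == "__main__":
    for ip, n in count_fin_wait2(SAMPLE).most_common():
        print(ip, n)
```

On a live host the same function could be fed the captured output of `netstat -tn`; a single IP with a count close to the worker limit would match the situation described above.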

Expected behaviour
After 5 minutes in FIN_WAIT2, the socket should be forcibly cleaned up.

IRRd version you are running
IRRd -- version 4.1.8

@mxsasha
Collaborator

mxsasha commented Dec 29, 2021

Sounds good, will dig into it.

@mxsasha mxsasha self-assigned this Dec 29, 2021
@mxsasha mxsasha added backport Should be backported to previous releases bug Something isn't working labels Dec 29, 2021
@mxsasha mxsasha added this to To do in NTT 2021 Dec 29, 2021
@mxsasha mxsasha modified the milestones: Release 4.3, IRRdv4 phase 3 Dec 29, 2021
@mxsasha
Collaborator

mxsasha commented Jan 13, 2022

@troy2914 do you have some idea of how long these sockets were lingering? Are we talking an accumulation to 68 over a time frame of hours? More? A minute?

@mxsasha mxsasha moved this from To do to In progress in NTT 2021 Jan 13, 2022
@mxsasha
Collaborator

mxsasha commented Jan 13, 2022

Initial thoughts: the close was initiated from IRRD, which is normal, but the remote end did not finish the closing of the socket, making the IRRD end stuck in FIN_WAIT2 (state diagram), waiting for a FIN from the remote end. It is possible that this line (docs for close(), docs for shutdown()) blocks at this point. If the IRRD server process is blocked, the socket is still attached, preventing the kernel from cleaning it up even after tcp_fin_timeout.

Possible solution: set a timeout on the socket, which apparently we don't do at all in the whois TCP server. Would need a bit of testing though.
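
To illustrate the idea, here is a rough sketch of how `settimeout` bounds a blocking call on a socket in Python. This is not IRRd's code: `read_with_timeout` and `handle_client` are hypothetical names, the 4096-byte read size and timeout values are arbitrary, and whether a timeout like this actually clears the stuck FIN_WAIT2 sockets is the part that would still need testing.

```python
import socket
from typing import Optional


def read_with_timeout(conn: socket.socket, timeout: float) -> Optional[bytes]:
    """Read once from conn; return None if nothing arrives within `timeout` seconds."""
    conn.settimeout(timeout)  # applies to later blocking calls on this socket
    try:
        return conn.recv(4096)
    except socket.timeout:  # alias of TimeoutError on Python 3.10+
        return None


def handle_client(conn: socket.socket, timeout: float = 5.0) -> str:
    """Hypothetical handler: give up on a silent client and release the socket."""
    try:
        data = read_with_timeout(conn, timeout)
        if data is None:
            return "timed out"
        if data == b"":
            return "closed by peer"
        return "got data"
    finally:
        conn.close()  # the file descriptor is released in every case


if __name__ == "__main__":
    # A connected pair stands in for the server and a client that never sends.
    server_side, silent_client = socket.socketpair()
    print(handle_client(server_side, timeout=0.2))
    silent_client.close()
```

The point of the `finally` block is that the worker always gets its socket back, so a client that never answers costs at most one timeout interval instead of holding the worker indefinitely.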

@mxsasha
Collaborator

mxsasha commented Jan 17, 2022

I haven't really been able to reproduce this, so I'm planning to move forward with settimeout on the socket. Triggering it definitely seems to require a very rare kind of misbehaving client, OS, or network.
