
Restore check for closed TCP connection in exporter process #360

Merged
merged 2 commits into vmware:main from antoninbas:restore-tcp-closed-conn-check on Sep 5, 2024

Conversation

antoninbas
Member

This is a partial reversal of d072ed8.
It turns out that the check can actually be useful as it can detect that the collector ("server" side) has closed its side of the connection, and this can be used as a signal to close our side of the connection as well. This can happen when a process using our collector implementation is closed, but no TCP RST is received when sendign a data set (in particular, this can happen when running in K8s and using a virtual IP to connect to the collector). This check can detect the issue much faster than relying on a keep-alive timeout. Furthermore, a client of this library could end up blocking if the connection has not timed out yet and the send buffer is full.

@yuntanghsu
Contributor

Just want to double-check: when the connection is closed by the server while we are sending data, will the client close the connection once it gets an error, and then reconnect?

@antoninbas
Member Author

Just want to double-check: when the connection is closed by the server while we are sending data, will the client close the connection once it gets an error, and then reconnect?

If you are referring to what happens (after merging this patch) when the server closes the connection: after at most 10s, the code will detect that the connection is closed and closeConnToCollector will be called internally; the next time there is data to be sent, there will be an error (attempting to send on a closed connection), and the client will try to reconnect.
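
To illustrate that flow, here is a hedged sketch of a periodic monitoring loop, building on the checkConnClosed sketch above (the type and field names are assumptions; closeConnToCollector is the internal helper mentioned above). Every 10s the exporter probes the connection and, on EOF, closes its own side so the next send fails fast and triggers a reconnect:

```go
// exportingProcess is a stand-in for the library's exporter type;
// the real type and fields may differ.
type exportingProcess struct {
	conn net.Conn
}

// closeConnToCollector closes our side of the connection
// (assumed internal helper).
func (ep *exportingProcess) closeConnToCollector() {
	ep.conn.Close()
}

// monitorConnection probes the connection every 10s until stopCh is
// closed. When the collector is detected to have closed its side, we
// close ours too, so the next send fails immediately instead of
// blocking on a full send buffer.
func (ep *exportingProcess) monitorConnection(stopCh <-chan struct{}) {
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-stopCh:
			return
		case <-ticker.C:
			if checkConnClosed(ep.conn) {
				ep.closeConnToCollector()
				return
			}
		}
	}
}
```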

This is a partial reversal of d072ed8.
It turns out that the check can actually be useful as it can detect that
the collector ("server" side) has closed its side of the connection, and
this can be used as a signal to close our side of the connection as
well. This can happen when a process using our collector implementation
is closed, but no TCP RST is received when sending a data set (in
particular, this can happen when running in K8s and using a virtual IP
to connect to the collector). This check can detect the issue much
faster than relying on a keep-alive timeout. Furthermore, a client of
this library could end up blocking if the connection has not timed out
yet and the send buffer is full.

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas antoninbas merged commit 2c76c1e into vmware:main Sep 5, 2024
7 checks passed
@antoninbas antoninbas deleted the restore-tcp-closed-conn-check branch September 5, 2024 23:21
antoninbas added a commit to antoninbas/vmware-go-ipfix that referenced this pull request Sep 5, 2024
antoninbas added a commit to antoninbas/vmware-go-ipfix that referenced this pull request Sep 5, 2024
heanlan pushed a commit that referenced this pull request Sep 5, 2024