inputs.socketstat plugin: CLOSE-WAIT tcp traffic is not supported #14511

Closed
mingfer opened this issue Dec 29, 2023 · 9 comments
Labels
bug unexpected problem or unintended behavior

Comments

@mingfer

mingfer commented Dec 29, 2023

Relevant telegraf.conf

[[inputs.socketstat]]
  ## ss can display information about tcp, udp, raw, unix, packet, dccp and sctp sockets
  ## Specify here the types you want to gather
  protocols = [ "tcp" ]

Logs from Telegraf

panic: runtime error: index out of range [4] with length 4

goroutine 27 [running]:
github.com/influxdata/telegraf/plugins/inputs/socketstat.getTagsAndState({0xc000361691, 0x3}, {0xc00051a000, 0x4, 0x4}, {0x123d5f8, 0xc0003f8f90})
	github.com/influxdata/telegraf@v1.29.1/plugins/inputs/socketstat/socketstat.go:159 +0xc5a
github.com/influxdata/telegraf/plugins/inputs/socketstat.(*Socketstat).parseAndGather(0xc0001187e0, {0x123de70, 0xc0000f6fc0}, 0x3?, {0xc000361691, 0x3})
	github.com/influxdata/telegraf@v1.29.1/plugins/inputs/socketstat/socketstat.go:133 +0x245
github.com/influxdata/telegraf/plugins/inputs/socketstat.(*Socketstat).Gather(0xc0001187e0, {0x123de70, 0xc0000f6fc0})
	github.com/influxdata/telegraf@v1.29.1/plugins/inputs/socketstat/socketstat.go:58 +0xc7
github.com/influxdata/telegraf/agent.(*Agent).testRunInputs.func2(0xc000118840)
	github.com/influxdata/telegraf@v1.29.1/agent/agent.go:516 +0x2e9
created by github.com/influxdata/telegraf/agent.(*Agent).testRunInputs in goroutine 25
	github.com/influxdata/telegraf@v1.29.1/agent/agent.go:485 +0xca

System info

Telegraf 1.29.1 Linux 4.15.0-136-generic #140-Ubuntu

Docker

No response

Steps to reproduce

  1. Run ./telegraf --config telegraf.conf --input-filter socketstat --test
  2. The command panics with the log shown above. The ss output that triggers the panic is: CLOSE-WAIT1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511 cubic wscale:7,7 rto:204 rtt:1.364/0.159 ato:40 mss:32768 pmtu:65535 rcvmss:536 advmss:65483 cwnd:10 bytes_acked:62173 bytes_received:373033 segs_out:10365 segs_in:5186 data_segs_out:5181 data_segs_in:5181 send 1921.9Mbps lastsnd:3340 lastrcv:3340 lastack:1272 pacing_rate 3842.0Mbps delivery_rate 7085.0Mbps app_limited busy:6824ms rcv_rtt:261404 rcv_space:65594 rcv_ssthresh:65495 minrtt:0.918
  3. The code expects CLOSE-WAIT 1 (State and Recv-Q separated by whitespace) but gets CLOSE-WAIT1.
    ...
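
A minimal Go sketch of the mismatch in step 3, assuming the parser splits each line on whitespace and then indexes the columns by position. The helper name fieldCount is hypothetical and is not part of Telegraf:

```go
package main

import (
	"fmt"
	"strings"
)

// fieldCount returns how many whitespace-separated columns a line has.
// Hypothetical helper for illustration only.
func fieldCount(line string) int {
	return len(strings.Fields(line))
}

func main() {
	// Well-formed line: State, Recv-Q, Send-Q, local address, remote address.
	normal := "CLOSE-WAIT 1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511"
	// Line as reported in this issue: State and Recv-Q run together.
	fused := "CLOSE-WAIT1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511"

	fmt.Println(fieldCount(normal)) // 5 columns, so index 4 exists
	fmt.Println(fieldCount(fused))  // 4 columns, so index 4 is out of range
}
```

With only four columns, reading index 4 matches the "index out of range [4] with length 4" message in the log.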

Expected behavior

socketstat,host=linux-31,local_addr=192.xx.xx.31,local_port=49884,node_id=cssp@mock-resource,proto=tcp,remote_addr=192.xxx.xx.44,remote_port=1521 bytes_acked=4792i,bytes_received=24542i,data_segs_in=70i,data_segs_out=70i,recv_q=0i,segs_in=73i,segs_out=77i,send_q=0i,state="CLOSE-WAIT" 1703839879000000000

Actual behavior

Telegraf panics instead of emitting the metric.

Additional info

No response

@mingfer mingfer added the bug unexpected problem or unintended behavior label Dec 29, 2023
@mingfer mingfer changed the title from "CLOSE-WAIT tcp traffic is not supported" to "inputs.socketstat plugin: CLOSE-WAIT tcp traffic is not supported" Dec 29, 2023
@powersj
Contributor

powersj commented Jan 2, 2024

Hi,

That output is not the same format as other messages, specifically the first few columns:

CLOSE-WAIT1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511

A typical line has two numeric columns, Recv-Q and Send-Q, before the addresses; you seem to have only one. See examples online:

If you look at the output of ss -in --tcp as well, what do you see? Is this something you can trigger?

@powersj powersj added the waiting for response waiting for response from contributor label Jan 2, 2024
@mingfer mingfer closed this as not planned Jan 3, 2024
@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jan 3, 2024
@mingfer mingfer reopened this Jan 3, 2024
@mingfer
Author

mingfer commented Jan 3, 2024

Hi,

That output is not the same format as other messages, specifically the first few columns:

CLOSE-WAIT1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511

A typical line has two numeric columns, Recv-Q and Send-Q, before the addresses; you seem to have only one. See examples online:

If you look at the output of ss -in --tcp as well, what do you see? Is this something you can trigger?

Thanks.
There are two numeric columns; there is no space between State and Recv-Q.

CLOSE-WAIT1 0 192.xx.xxx.31:54994 192.xx.xx.31:22511

  • CLOSE-WAIT is State
  • 1 is Recv-Q

The result of executing ss --in --tcp|grep "CLOSE-WAIT":

[screenshot of the ss output; image not preserved]

The three states FIN-WAIT-1, FIN-WAIT-2, and CLOSE-WAIT all trigger the error.

@powersj
Contributor

powersj commented Jan 3, 2024

There are two numeric columns, there is no space between State and Recv-Q.

Aaah thank you for pointing that out, totally missed that. Well that won't work with our parsing which splits on spaces of course :)

And the three states of fin-wait-1, fin-wait-2

Hmm, so it is any state that is 10 characters long? TIME-WAIT would be the next longest; does it have this issue?

@powersj powersj added the waiting for response waiting for response from contributor label Jan 3, 2024
@mingfer
Author

mingfer commented Jan 3, 2024

There is no problem with other states (such as TIME-WAIT); only the states that are 10 characters long are affected.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jan 3, 2024
@powersj
Contributor

powersj commented Jan 3, 2024

Hmm I'm wondering if this is an issue with only your version of ss:

❯ ss -in --tcp | grep -i close
CLOSE-WAIT 25     0      192.168.1.160:48264   170.114.52.83:443 
CLOSE-WAIT 25     0      192.168.1.160:53336    170.114.52.2:443 
CLOSE-WAIT 25     0      192.168.1.160:53322    170.114.52.2:443 

Based on your kernel, I assume this is Ubuntu 18.04? Do you run any other distros and/or versions?

@mingfer
Author

mingfer commented Jan 4, 2024

I tried it on different operating systems. It may really be related to my system version 😅.

| OS | ss version | Character length of the state field |
| --- | --- | --- |
| Linux version 4.15.0-136-generic (buildd@lcy01-amd64-029) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #140-Ubuntu SMP Thu Jan 28 05:20:47 UTC 2021 | iproute2-ss180129 | 10 (the issue is triggered) |
| Linux version 4.15.0-201-generic (buildd@lcy02-amd64-110) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #212-Ubuntu SMP Mon Nov 28 11:29:59 UTC 2022 | iproute2-ss180129 | 21 |
| Red Hat 4.8.5-11 | iproute2-ss130716 | 11 |
| ubuntu1~22.04 | iproute2-5.15.0 | 22 |

@powersj
Contributor

powersj commented Jan 4, 2024

For the first and second rows, is the only difference the kernel version? Is the ss version the same?

Are you in a position to update and verify that this resolves the issue?

I'm not opposed to carrying a fix in case this occurs again, but at the same time I certainly would like to avoid some extra logic for a bug that is already fixed.

Thanks!

@powersj powersj added the waiting for response waiting for response from contributor label Jan 4, 2024
@mingfer
Author

mingfer commented Jan 5, 2024

Thank you so much.

For some reason, we are temporarily unable to upgrade the kernel version. At the same time, we made a simple fix on the current socketstat plugin and released a special Telegraf version to be compatible with this device. Of course, it would be great to have this issue fixed, as I may not be the only one affected.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Jan 5, 2024
@powersj
Contributor

powersj commented Jan 5, 2024

At the same time, we made a simple fix on the current socketstat plugin and released a special Telegraf version to be compatible with this device.

Glad to hear that works. Thanks for working this out with me!

We chatted about this and would prefer not to maintain a fix for it, given that the OS version is old and will soon be out of standard support. As such, I am going to close this for now.

@powersj powersj closed this as not planned Jan 5, 2024