tcptop #692
Polling of ss -nti

This approach uses the recent RFC 4898 additions to tcp_info, namely tcpEStatsAppHCThruOctetsAcked and tcpEStatsAppHCThruOctetsReceived. It would also need to BPF trace tcp_close(), to catch short-lived sessions or those that closed during the interval. One would think they'd stay around in TIME-WAIT, but they aren't visible in ss -nti, as I believe the full socket is no longer kept (a measure to survive DoS attacks). Another problem is the overhead; here's a busy production server:
So that's about 80 ms of CPU time each time we poll. Some servers will be much better (fewer connections), some a bit worse. A tcptop with a 1-second refresh would cost at least 8% of one CPU on this server.

BPF TCP send/receive

Back to tracing tcp_sendmsg() and tcp_recvmsg() (or tcp_cleanup_rbuf(), which has both the socket and the correct size in a single probe). I'd done a stress test of 600k TCP events/second and measured 10% CPU overhead across 8 CPUs (i.e., 80% of one CPU). So the overhead can become significant. But is that likely? I've used funccount (from my ftrace perf-tools) to measure the total rate of tcp_sendmsg() and tcp_cleanup_rbuf() on three common production services, so I can extrapolate the expected overhead. They are:
Now for the ss -nti cost on these services:
So while the BPF cost can be significant during stress tests, for real workloads it's expected to be low overhead, and even cheaper than ss -nti polling (at a 1-second interval).
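For concreteness, here's a minimal sketch of the tracing approach (not the finished tool): it sums TCP payload bytes per PID by kprobing tcp_sendmsg() and tcp_cleanup_rbuf() with bcc. It assumes the Linux 4.1+ tcp_sendmsg(sk, msg, size) signature, and omits tcp_close() tracing and per-session (address-pair) keys for brevity:

```python
#!/usr/bin/env python
# Sketch: per-PID TCP throughput via kprobes (bcc). Not the final tcptop.
from bcc import BPF
from time import sleep

b = BPF(text="""
#include <net/sock.h>

BPF_HASH(send_bytes, u32, u64);
BPF_HASH(recv_bytes, u32, u64);

int kprobe__tcp_sendmsg(struct pt_regs *ctx, struct sock *sk,
        struct msghdr *msg, size_t size)
{
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    send_bytes.increment(pid, size);
    return 0;
}

/* tcp_cleanup_rbuf() is traced instead of tcp_recvmsg(): it sees the
 * socket and the number of bytes actually copied in a single probe. */
int kprobe__tcp_cleanup_rbuf(struct pt_regs *ctx, struct sock *sk, int copied)
{
    if (copied <= 0)
        return 0;
    u32 pid = bpf_get_current_pid_tgid() >> 32;
    recv_bytes.increment(pid, (u64)copied);
    return 0;
}
""")

while True:
    sleep(1)
    print("%-8s %-10s %-10s" % ("PID", "TX_KB", "RX_KB"))
    rx = {k.value: v.value for k, v in b["recv_bytes"].items()}
    for k, v in sorted(b["send_bytes"].items(), key=lambda kv: -kv[1].value):
        print("%-8d %-10d %-10d" % (k.value, v.value // 1024,
                                    rx.get(k.value, 0) // 1024))
    b["send_bytes"].clear()
    b["recv_bytes"].clear()
```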
'ss' time is excessively high.
Here's one of the systems:
This is why I was using ftrace to count the kernel function rates.
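For reference, that counting can be done with funccount from perf-tools; these invocations are a sketch based on its usage message (flags may differ by version):

```
# count calls over a 10-second duration (perf-tools funccount, ftrace-based)
./funccount -d 10 tcp_sendmsg
./funccount -d 10 tcp_cleanup_rbuf
```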
I'll close the issue, but we can still discuss...
But the problem is not in the kernel or netlink interface:
So I still think tcp_info is a proper way forward.
Ok, but still, 171 ms in sys? We'll have a lot less user time, but we'd still need to store all byte counts in a session hash and do deltas to see if anything changed each interval. And we'd still have to trace tcp_close() to catch short-lived sessions. It all adds up.
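As a rough sketch of that polling loop (the ss -nti output layout and the bytes_acked:/bytes_received: field names are assumptions; check your ss version), keeping the session hash and printing per-interval deltas:

```python
#!/usr/bin/env python
# Sketch: poll "ss -nti" for the RFC 4898 byte counters, keyed by address
# pair, and print per-interval deltas. Sessions that close during the
# interval simply vanish from the output, which is why tcp_close() tracing
# with BPF is still needed on top of this.
import re
import subprocess
import time

prev = {}  # (laddr, raddr) -> (bytes_acked, bytes_received)

while True:
    out = subprocess.check_output(["ss", "-nti"]).decode()
    cur = {}
    addr = None
    for line in out.splitlines():
        cols = line.split()
        # connection line: State Recv-Q Send-Q Local:Port Peer:Port
        if len(cols) >= 5 and cols[0] == "ESTAB":
            addr = (cols[3], cols[4])
            continue
        # tcp_info line: includes bytes_acked:N bytes_received:N
        acked = re.search(r"bytes_acked:(\d+)", line)
        rcvd = re.search(r"bytes_received:(\d+)", line)
        if addr and (acked or rcvd):
            cur[addr] = (int(acked.group(1)) if acked else 0,
                         int(rcvd.group(1)) if rcvd else 0)
    for a, (tx, rx) in cur.items():
        ptx, prx = prev.get(a, (tx, rx))
        if (tx, rx) != (ptx, prx):
            print("%s -> %s  TX %dB  RX %dB" % (a[0], a[1], tx - ptx, rx - prx))
    prev = cur
    time.sleep(1)
```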
At least:
Maybe more columns as well.
I submitted one version, and it was discussed in #690, but I removed it from that PR while considering an alternate approach. I'm now investigating the two approaches described above.