add tcp congestion status duration statistic tool #3899

jackygam2001 · 2022-03-07T03:08:10Z

Add a TCP congestion control status duration statistics tool. It can be used to evaluate the performance of the network and of the congestion algorithm.

iamkafai · 2022-03-09T19:28:27Z

The use case makes sense to me. The implementation is not straightforward, though, because the kernel is missing a tracepoint in tcp_set_ca_state(), so it has to resort to kprobes and kretprobes on all functions calling tcp_set_ca_state().

".set_state" of "struct tcp_congestion_ops" is implemented for some popular cc like cubic, bbr, and dctcp. That would be a more stable kprobe point.

If you are interested, a tracepoint in tcp_set_ca_state() may be a good addition to the kernel, considering it is not on the hot path. This future kernel work does not need to derail the current bcc pull request.

jackygam2001 · 2022-03-11T00:52:59Z

Yes, the upstream kernel currently only has the inline function tcp_set_ca_state(), which the compiler inlines at build time, so it cannot be probed. That is why my implementation has to kprobe and kretprobe all functions that call it.

brendangregg · 2022-03-12T06:05:05Z

Thanks; Several comments:

1. Name is too long; should be tcpcong or something.
2. It has 20 k{ret}probes plus a lot of struct digging, which would make it by far the most brittle tool in all of bcc. Most kprobe tools use between 1 and 6 kprobes. The prior biggest is zfsdist/zfsslower, which has 16, but only attaches to 6 at a time depending on the software version, and it also does no struct digging. (While BTF can help solve struct members moving, it can't solve logic changes.) Also, looking at these tcp functions:

grep attach_k.*probe tcpcongestdura.py
b.attach_kprobe(event="tcp_try_keep_open",
b.attach_kretprobe(event="tcp_try_keep_open",
b.attach_kprobe(event="tcp_enter_cwr", fn_name="trace_entry_tcp_enter_cwr")
b.attach_kretprobe(event="tcp_enter_cwr",
b.attach_kprobe(event="tcp_process_tlp_ack",
b.attach_kretprobe(event="tcp_process_tlp_ack",
b.attach_kprobe(event="tcp_enter_recovery",
b.attach_kretprobe(event="tcp_enter_recovery",
b.attach_kprobe(event="tcp_enter_loss", fn_name="trace_entry_tcp_enter_loss")
b.attach_kretprobe(event="tcp_enter_loss",
b.attach_kprobe(event="tcp_simple_retransmit",
b.attach_kretprobe(event="tcp_simple_retransmit",
b.attach_kprobe(event="tcp_try_undo_recovery",
b.attach_kretprobe(event="tcp_try_undo_recovery",
b.attach_kprobe(event="tcp_try_undo_loss",
b.attach_kretprobe(event="tcp_try_undo_loss",
b.attach_kprobe(event="tcp_fastretrans_alert",
b.attach_kretprobe(event="tcp_fastretrans_alert",
b.attach_kprobe(event="tcp_disconnect", fn_name="trace_entry_tcp_enter_open")
b.attach_kretprobe(event="tcp_disconnect",

I guess this is only really tracing 10 functions as it's doing both kprobe and kretprobe.

Things like tcp_disconnect() I'm not too worried about as it's in tcp_prot, and isn't likely going to change, and tcp_enter_loss() is in tcp.h. But others like tcp_try_undo_loss() feel like gritty internals. I'm ok with one or two such functions in a tool, but if it ends up being several it's a pain to maintain.

So my two questions are: How many of these are unstable internal functions, and does it need to trace them all or can you pop up a stack frame and trace some more-stable parent function to do the same job? I'm not saying this is a blocker, I just want to be thoughtful about the future maintenance.

3. How does it compare to tcp_probe and other TCP congestion analysis tools? (Other people will no doubt ask.)
4. Output columns look misaligned by one.
5. Default output is > 80 chars, but fortunately only by 5. This does make it hard (and sometimes impossible) to include the output in things like published books and printed articles, but I don't see an easy fix.
6. Columns are in "us", but the numbers are so big it's confusing. Would it make sense to make them "ms" by default? Have a switch to output "us".
7. I think the opening example in the examples file is missing a sentence or more to explain why the output matters. Which of these columns is evidence of a problem, such that clients are blocked waiting? Help a newcomer interpret what they are seeing.
8. These numbers are great, but how would I double check that they are right? Have you already got some fields in nstat/netstat or something to compare them to, at least as a sanity check? (They won't have latency totals, but they may have counts.) Just looking for a way to sanity check the output.
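
As a rough illustration of the "20 attachments, 10 functions" observation above, here is a minimal Python sketch that groups attach calls by kernel function. It only parses text shaped like the quoted grep output; the `summarize_probes` helper and the abbreviated `GREP_OUTPUT` sample (first three functions only) are assumptions made for this example and are not part of the tool.

```python
import re
from collections import defaultdict

# Abbreviated copy of the grep output quoted in the review
# (first three functions only, to keep the example short).
GREP_OUTPUT = '''\
b.attach_kprobe(event="tcp_try_keep_open",
b.attach_kretprobe(event="tcp_try_keep_open",
b.attach_kprobe(event="tcp_enter_cwr", fn_name="trace_entry_tcp_enter_cwr")
b.attach_kretprobe(event="tcp_enter_cwr",
b.attach_kprobe(event="tcp_process_tlp_ack",
b.attach_kretprobe(event="tcp_process_tlp_ack",
'''

# Captures the probe kind and the kernel function named in event="...".
ATTACH_RE = re.compile(r'attach_(kprobe|kretprobe)\(event="([^"]+)"')


def summarize_probes(text):
    """Map each kernel function name to the set of probe kinds attached to it."""
    probes = defaultdict(set)
    for kind, func in ATTACH_RE.findall(text):
        probes[func].add(kind)
    return dict(probes)


summary = summarize_probes(GREP_OUTPUT)
attachments = sum(len(kinds) for kinds in summary.values())
print(f"{attachments} attachments across {len(summary)} functions")
```

Run against the full grep output, the same grouping shows each function carrying one kprobe and one kretprobe, which is why the number of distinct functions is half the number of attachments.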

jackygam2001 · 2022-03-16T05:42:51Z

@brendangregg,
Thanks for your comments!
For comment 2, you are right. I have checked the kernel code: most of the cc status change functions are called by 5 functions (tcp_fastretrans_alert, tcp_enter_cwr, tcp_enter_loss, tcp_enter_recovery, tcp_process_tlp_ack), so I will update the code in the next pull request. I think @iamkafai's advice is very good, and I will try to add a tracepoint for tcp_set_ca_state to the upstream kernel.
For comment 3, I think tcp_probe is for tracing a connection's congestion window updates, whereas this tool traces congestion control status changes and their duration. A cwnd update may not cause a status change, so they are two different tools.
For comment 8, I did not use nstat/netstat to validate the tool, as those tools only provide a summary across all TCP sockets. Instead I used packetdrill scripts, which Google provides on GitHub, to check the output of my tool.
For the other comments, I will update the code in the next pull request. Thanks very much!
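
To make the distinction in comment 3 concrete, the following is a minimal sketch of per-state duration accounting, written in plain Python rather than BPF. It assumes state-change events arrive as `(timestamp_us, new_state)` pairs; the `state_durations` helper and the sample timestamps are hypothetical and are not taken from tcpcong.py. The numeric values follow the kernel's `enum tcp_ca_state`.

```python
from collections import defaultdict

# Kernel enum tcp_ca_state values mapped to short labels.
CA_STATES = {0: "open", 1: "disorder", 2: "cwr", 3: "recovery", 4: "loss"}


def state_durations(events, end_us):
    """Sum the time spent in each congestion state.

    events: list of (timestamp_us, new_state), sorted by time.
    end_us: timestamp at which observation stops.
    """
    totals = defaultdict(int)
    # Pair every event with the next one; the last state lasts until end_us.
    following = events[1:] + [(end_us, None)]
    for (ts, state), (next_ts, _) in zip(events, following):
        totals[CA_STATES[state]] += next_ts - ts
    return dict(totals)


# Hypothetical trace of one flow; the timestamps are made up for illustration.
events = [(0, 0), (1_000, 1), (1_500, 3), (4_500, 0)]
durations_us = state_durations(events, end_us=10_000)
for name, us in durations_us.items():
    print(f"{name:<10} {us / 1000:.1f} ms")
```

Only state transitions produce events here, so a cwnd update that leaves the state unchanged never shows up, which is the difference from a cwnd-oriented trace. The sketch prints milliseconds, in line with the suggestion in comment 6.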

yonghong-song

A lot of inlining COULD happen for the probe functions used in this tool, so I strongly suggest you implement the tracepoint as Martin suggested. The tool can still proceed, but with a tracepoint in the kernel we can have a better alternative implementation for future kernels.
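
One quick, partial way to spot the inlining concern is to check whether each function the tool wants to probe still exists as a code symbol in `/proc/kallsyms`. The sketch below assumes the usual `address type name [module]` line layout; the `probeable` helper and the sample text are made up for illustration. A symbol being present does not prove that every call site goes through it, so treat this as a first check only.

```python
def probeable(candidates, kallsyms_text):
    """Split candidates into (present, missing) by exact symbol name."""
    text_symbols = set()
    for line in kallsyms_text.splitlines():
        fields = line.split()
        # Layout: address, type, name, optional [module]; 't'/'T' mark code.
        if len(fields) >= 3 and fields[1] in ("t", "T"):
            text_symbols.add(fields[2])
    present = [f for f in candidates if f in text_symbols]
    missing = [f for f in candidates if f not in text_symbols]
    return present, missing


# The five functions named earlier in this conversation.
CANDIDATES = ["tcp_fastretrans_alert", "tcp_enter_cwr", "tcp_enter_loss",
              "tcp_enter_recovery", "tcp_process_tlp_ack"]

# Made-up sample; on a real system read /proc/kallsyms instead.
SAMPLE = """\
0000000000000000 T tcp_enter_cwr
0000000000000000 T tcp_enter_loss
0000000000000000 t tcp_fastretrans_alert
0000000000000000 t tcp_enter_recovery.part.0
"""

present, missing = probeable(CANDIDATES, SAMPLE)
print("present:", present)
print("missing:", missing)
```

In the sample, a compiler-split symbol such as `tcp_enter_recovery.part.0` does not count as a match, so that function is reported as missing along with the one that does not appear at all.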

tests/python/test_tools_smoke.py

tools/tcpcong.py

jackygam2001 · 2022-03-17T08:00:22Z

@yonghong-song,
Thanks for your comments. Yes, I will prepare a patch adding the tracepoint for tcp_set_ca_state to the upstream kernel, and I will address the other comments in the next commit. Thanks very much!

yonghong-song

Mostly okay. A few minor comments.

man/man8/tcpcong.8

tools/tcpcong.py

tools/tcpcong_example.txt

tools/tcpcong.py

The congestion status of a tcp flow may be updated since there is congestion between tcp sender and receiver. It makes sense for adding tracepoint for congestion status update function to evaluate the performance of network and congestion algorithm.
Link: iovisor/bcc#3899
Signed-off-by: jackygam2001 <jacky_gam_2001@163.com>

The congestion status of a tcp flow may be updated since there is congestion between tcp sender and receiver. It makes sense to add tracepoint for congestion status set function to summate cc status duration and evaluate the performance of network and congestion algorithm. The background of this patch is below.
Link: iovisor/bcc#3899
Signed-off-by: Ping Gan <jacky_gam_2001@163.com>

The congestion status of a tcp flow may be updated since there is congestion between tcp sender and receiver. It makes sense to add tracepoint for congestion status set function to summate cc status duration and evaluate the performance of network and congestion algorithm. The background of this patch is below.
Link: iovisor/bcc#3899
Signed-off-by: Ping Gan <jacky_gam_2001@163.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20220406010956.19656-1-jacky_gam_2001@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>

jackygam2001 and others added 2 commits March 7, 2022 10:52

add tcp congestion status duration statistic tool

5adbf02

Merge pull request #1 from jackygam2001/tcp-congestion-duration-stat-…

e578e15

jackygam2001 requested review from drzaeus77, yonghong-song, davemarchevsky, brendangregg, goldshtn and 4ast as code owners March 7, 2022 03:08

Merge branch 'iovisor:master' into master

387b87c

jackygam2001 and others added 2 commits March 16, 2022 16:09

Merge branch 'iovisor:master' into master

e2c61c9

update the kprobe and kretprobe functions and output for tcp congest …

b8d199d

yonghong-song reviewed Mar 17, 2022

Merge branch 'iovisor:master' into master

b34c38d

jackygam2001 added 2 commits March 17, 2022 16:58

update the code for the tool

c580652

add kfunc/kretfunc for the tool when it's available

d6929d6

yonghong-song reviewed Mar 22, 2022

jackygam2001 and others added 4 commits March 22, 2022 19:34

Merge branch 'iovisor:master' into master

21fce0e

update the tool for supporting ipv6

b59d237

Merge branch 'iovisor:master' into master

9f45a43

update tool's usage comment

ecd1e4e

chenhengqi reviewed Mar 23, 2022

yonghong-song merged commit a58c795 into iovisor:master Mar 24, 2022
