Normal drops? #3

Closed
dimaslv opened this issue Apr 8, 2019 · 1 comment

dimaslv commented Apr 8, 2019

Hello!
We see two main sources of drops on our server:

33 drops at tcp_v4_do_rcv+8e (0xffffffff8178619e)
413 drops at skb_release_data+b1 (0xffffffff816ea931)
  1. addr2line does not work with those addresses:
# addr2line -f -e /boot/vmlinuz-4.19.30-1.el7.x86_64  0xffffffff816ea931
addr2line: /boot/vmlinuz-4.19.30-1.el7.x86_64: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .bss
??
??:0
# addr2line -f -e /boot/vmlinuz-4.19.30-1.el7.x86_64  0xffffffff8178619e
addr2line: /boot/vmlinuz-4.19.30-1.el7.x86_64: Warning: Ignoring section flag IMAGE_SCN_MEM_NOT_PAGED in section .bss
??
??:0
  2. I see both functions with perf record -g -a -e skb:kfree_skb -F 100:
java 26982 [026] 330543.363561: skb:kfree_skb: skbaddr=0xffff88803dd83a00 protocol=2048 location=0xffffffff816ea931
        ffffffff816e9c9b kfree_skb+0x7b ([kernel.kallsyms])
        ffffffff816ea931 skb_release_data+0xb1 ([kernel.kallsyms])
        ffffffff816e9bf4 skb_release_all+0x24 ([kernel.kallsyms])
        ffffffff816e9c12 __kfree_skb+0x12 ([kernel.kallsyms])
        ffffffff8176c4dc tcp_recvmsg+0x63c ([kernel.kallsyms])
        ffffffff8179fa89 inet_recvmsg+0x59 ([kernel.kallsyms])
        ffffffff816dcb73 sock_recvmsg+0x43 ([kernel.kallsyms])
        ffffffff816df2c7 __sys_recvfrom+0xe7 ([kernel.kallsyms])
        ffffffff816df368 __x64_sys_recvfrom+0x28 ([kernel.kallsyms])
        ffffffff81004150 do_syscall_64+0x60 ([kernel.kallsyms])
        ffffffff81a00088 entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
            7f4cc6340a8b __libc_recv+0x7b (/usr/lib64/libpthread-2.17.so)

java 26719 [013] 330543.363601: skb:kfree_skb: skbaddr=0xffff88814a72a000 protocol=2048 location=0xffffffff8178619e
        ffffffff816e9c9b kfree_skb+0x7b ([kernel.kallsyms])
        ffffffff8178619e tcp_v4_do_rcv+0x8e ([kernel.kallsyms])
        ffffffff817877ac tcp_v4_rcv+0xa6c ([kernel.kallsyms])
        ffffffff8175cd9f ip_local_deliver_finish+0x9f ([kernel.kallsyms])
        ffffffff8175d3df ip_local_deliver+0x6f ([kernel.kallsyms])
        ffffffff8175cf7b ip_rcv_finish+0x8b ([kernel.kallsyms])
        ffffffff8175d4a6 ip_rcv+0x56 ([kernel.kallsyms])
        ffffffff81702dc7 __netif_receive_skb_one_core+0x57 ([kernel.kallsyms])
        ffffffff81702e28 __netif_receive_skb+0x18 ([kernel.kallsyms])
        ffffffff81702f20 process_backlog+0xb0 ([kernel.kallsyms])
        ffffffff81703709 net_rx_action+0x289 ([kernel.kallsyms])
        ffffffff81c000d1 __softirqentry_text_start+0xd1 ([kernel.kallsyms])
        ffffffff81092198 irq_exit+0xe8 ([kernel.kallsyms])
        ffffffff81a01cd9 do_IRQ+0x59 ([kernel.kallsyms])
        ffffffff81a0098f ret_from_intr+0x0 ([kernel.kallsyms])
        ffffffff810b962d finish_task_switch+0x7d ([kernel.kallsyms])
        ffffffff818736a3 __schedule+0x2b3 ([kernel.kallsyms])
        ffffffff81873ca6 schedule+0x36 ([kernel.kallsyms])
        ffffffff818781d8 schedule_hrtimeout_range_clock+0x128 ([kernel.kallsyms])
        ffffffff81878203 schedule_hrtimeout_range+0x13 ([kernel.kallsyms])
        ffffffff812dad4d ep_poll+0x33d ([kernel.kallsyms])
        ffffffff812daee9 do_epoll_wait+0xb9 ([kernel.kallsyms])
        ffffffff812daf1e __x64_sys_epoll_wait+0x1e ([kernel.kallsyms])
        ffffffff81004150 do_syscall_64+0x60 ([kernel.kallsyms])
        ffffffff81a00088 entry_SYSCALL_64_after_hwframe+0x44 ([kernel.kallsyms])
            7f4cc5c49483 [unknown] (/usr/lib64/libc-2.17.so)
            7f4cb8d3d3a4 [unknown] (/tmp/perf-25546.map)
              6cf805333a [unknown] ([unknown])
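
For reference, the symbol+offset pairs in these traces are tagged [kernel.kallsyms], so they appear to come from the kernel symbol table rather than from debug info. That would explain why perf can name the functions while addr2line on the compressed vmlinuz boot image cannot. Below is a minimal sketch of that lookup, assuming a symbol table in the `/proc/kallsyms` line format (`<hex address> <type> <name>`). The base addresses in `SAMPLE` are not copied from a real symbol table: they are derived by subtracting the offsets shown in the traces from the reported addresses, and the type letters are placeholders. `parse_kallsyms` and `resolve` are hypothetical helper names.

```python
import bisect

def parse_kallsyms(text):
    """Parse '<hex addr> <type> <name> [module]' lines into a sorted (addr, name) list."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(fields[0], 16), fields[2]))
    symbols.sort()
    return symbols

def resolve(symbols, address):
    """Return (name, offset) for the nearest symbol at or below address, or None."""
    addrs = [addr for addr, _ in symbols]
    i = bisect.bisect_right(addrs, address) - 1
    if i < 0:
        return None  # address lies below every known symbol
    base, name = symbols[i]
    return name, address - base

# Assumed sample table: bases computed from the traces (address minus offset),
# type letters are illustrative only.
SAMPLE = """\
ffffffff816e9bd0 t skb_release_all
ffffffff816e9c00 T __kfree_skb
ffffffff816e9c20 T kfree_skb
ffffffff816ea880 t skb_release_data
ffffffff81786110 T tcp_v4_do_rcv
"""

SYMBOLS = parse_kallsyms(SAMPLE)

for addr in (0xffffffff816ea931, 0xffffffff8178619e):
    name, offset = resolve(SYMBOLS, addr)
    print(f"{addr:#x} -> {name}+{offset:x}")
# prints:
# 0xffffffff816ea931 -> skb_release_data+b1
# 0xffffffff8178619e -> tcp_v4_do_rcv+8e
```

On a live system the same functions could be fed the contents of `/proc/kallsyms` instead of `SAMPLE`. Reading it as an unprivileged user may show zeroed addresses, depending on the kernel's pointer-restriction settings.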

If I understand correctly, neither case is a problem, since each is just the drop of an already consumed packet. Is that right? Is it possible to exclude such cases from reporting?

nhorman (Owner) commented Apr 19, 2019

Strictly speaking, the first drop, in tcp_v4_do_rcv, is due to the stack having received a packet for a TCP connection that was not established on the host, which makes it an expected drop.

The drop in skb_release_data occurs because you received a fragmented frame and all of its fragments are being freed. I would argue that the call should be changed to consume_skb to silence it, but this drop is also expected.

As for filtering, I've always meant to add a filter feature but never found the time. If you would like to look into it, you would be more than welcome to submit a patch.
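
Until such a feature exists, one possible workaround is to post-process the reported lines outside the tool. The following is only a sketch of that idea, not part of this project: it assumes the `N drops at symbol+offset (0xaddress)` line format quoted at the top of this issue, and `parse_drops`, `filter_drops`, and the ignore set are hypothetical names chosen for illustration.

```python
import re
from typing import Iterable, Iterator, List, NamedTuple, Set

# Matches lines such as: "33 drops at tcp_v4_do_rcv+8e (0xffffffff8178619e)"
DROP_LINE = re.compile(
    r"^\s*(\d+) drops at ([\w.]+)\+([0-9a-fA-F]+) \((0x[0-9a-fA-F]+)\)\s*$"
)

class Drop(NamedTuple):
    count: int
    symbol: str
    offset: int
    address: int

def parse_drops(lines: Iterable[str]) -> Iterator[Drop]:
    """Yield a Drop for every line in the assumed report format; skip the rest."""
    for line in lines:
        m = DROP_LINE.match(line)
        if m:
            yield Drop(
                count=int(m.group(1)),
                symbol=m.group(2),
                offset=int(m.group(3), 16),
                address=int(m.group(4), 16),
            )

def filter_drops(drops: Iterable[Drop], ignored_symbols: Set[str]) -> List[Drop]:
    """Keep only drops whose symbol is not in the user-supplied ignore set."""
    return [d for d in drops if d.symbol not in ignored_symbols]

# The two lines reported in this issue.
REPORT = [
    "33 drops at tcp_v4_do_rcv+8e (0xffffffff8178619e)",
    "413 drops at skb_release_data+b1 (0xffffffff816ea931)",
]

# Example: hide the skb_release_data location that was described as expected.
remaining = filter_drops(parse_drops(REPORT), {"skb_release_data"})
for d in remaining:
    print(f"{d.count} drops at {d.symbol}+{d.offset:x} ({d.address:#x})")
# prints: 33 drops at tcp_v4_do_rcv+8e (0xffffffff8178619e)
```

Filtering by symbol name alone is coarse: a function such as tcp_v4_do_rcv may contain more than one drop site, so matching on symbol plus offset might be the safer key for a given kernel build.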
