bpf: Add socket destroy capability #4848

kernel-patches-bot · 2023-03-30T15:33:17Z

Pull request for series with
subject: bpf: Add socket destroy capability
version: 5
url: https://patchwork.kernel.org/project/netdevbpf/list/?series=735459

Previously, the BPF TCP iterator acquired the fast version of the sock lock, which disables BH. This introduced a circular dependency with code paths that later acquire the sockets hash table bucket lock. Replace the fast version of the sock lock with the slow one, which allows BPF programs executed from the iterator to destroy TCP listening sockets using the bpf_sock_destroy kfunc (implemented in follow-up commits).

Here is a stack trace that motivated this change:

```
1) sock_lock with BH disabled + bucket lock

lock_acquire+0xcd/0x330
_raw_spin_lock_bh+0x38/0x50
inet_unhash+0x96/0xd0
tcp_set_state+0x6a/0x210
tcp_abort+0x12b/0x230
bpf_prog_f4110fb1100e26b5_iter_tcp6_server+0xa3/0xaa
bpf_iter_run_prog+0x1ff/0x340
bpf_iter_tcp_seq_show+0xca/0x190
bpf_seq_read+0x177/0x450
vfs_read+0xc6/0x300
ksys_read+0x69/0xf0
do_syscall_64+0x3c/0x90
entry_SYSCALL_64_after_hwframe+0x72/0xdc

2) sock lock with BH enable

[ 1.499968] lock_acquire+0xcd/0x330
[ 1.500316] _raw_spin_lock+0x33/0x40
[ 1.500670] sk_clone_lock+0x146/0x520
[ 1.501030] inet_csk_clone_lock+0x1b/0x110
[ 1.501433] tcp_create_openreq_child+0x22/0x3f0
[ 1.501873] tcp_v6_syn_recv_sock+0x96/0x940
[ 1.502284] tcp_check_req+0x137/0x660
[ 1.502646] tcp_v6_rcv+0xa63/0xe80
[ 1.502994] ip6_protocol_deliver_rcu+0x78/0x590
[ 1.503434] ip6_input_finish+0x72/0x140
[ 1.503818] __netif_receive_skb_one_core+0x63/0xa0
[ 1.504281] process_backlog+0x79/0x260
[ 1.504668] __napi_poll.constprop.0+0x27/0x170
[ 1.505104] net_rx_action+0x14a/0x2a0
[ 1.505469] __do_softirq+0x165/0x510
[ 1.505842] do_softirq+0xcd/0x100
[ 1.506172] __local_bh_enable_ip+0xcc/0xf0
[ 1.506588] ip6_finish_output2+0x2a8/0xb00
[ 1.506988] ip6_finish_output+0x274/0x510
[ 1.507377] ip6_xmit+0x319/0x9b0
[ 1.507726] inet6_csk_xmit+0x12b/0x2b0
[ 1.508096] __tcp_transmit_skb+0x549/0xc40
[ 1.508498] tcp_rcv_state_process+0x362/0x1180
```

Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>

This is a preparatory commit to remove the field. The field was previously shared between the proc fs and BPF UDP socket iterators. As the follow-up commits will decouple the implementation for the iterators, remove the field. As for the BPF socket iterator, filtering of sockets is expected to be done in BPF programs.

Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>

This is a preparatory commit that refactors the code matching socket attributes in iterators into a helper function, and uses that helper in the proc fs iterator.

Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>

Batch UDP sockets from the BPF iterator in a way that allows for overlapping locking semantics in BPF/kernel helpers executed in BPF programs. This lets the BPF socket destroy kfunc (introduced by follow-up patches) execute from BPF iterator programs.

Previously, BPF iterators acquired the sock lock and the sockets hash table bucket lock while executing BPF programs. This prevented BPF helpers that acquire these locks again from being executed from BPF iterators. With the batching approach, we acquire a bucket lock, batch all the bucket sockets, and then release the bucket lock. This enables BPF or kernel helpers to skip sock locking when invoked in the supported BPF contexts.

The batching logic is similar to the logic implemented in the TCP iterator: https://lore.kernel.org/bpf/20210701200613.1036157-1-kafai@fb.com/.

Suggested-by: Martin KaFai Lau <martin.lau@kernel.org>
Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
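The batching approach described above (take the bucket lock, collect the bucket's sockets, release the lock, then run the program on each socket) can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the kernel implementation: `batch_sock`, `batch_bucket`, `iterate_bucket_batched`, `unhash_visitor`, and the `locked` flag standing in for the bucket spinlock are all hypothetical names, and the fixed `MAX_BATCH` capacity is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BATCH 16 /* assumed fixed capacity; a simplification for this model */

/* Mock socket with a reference count; not the kernel's struct sock. */
struct batch_sock {
	int id;
	int refcnt;
	struct batch_sock *next;
};

/* Mock hash bucket; `locked` stands in for the bucket spinlock. */
struct batch_bucket {
	bool locked;
	struct batch_sock *head;
};

typedef void (*visit_fn)(struct batch_bucket *b, struct batch_sock *sk);

static void bucket_lock(struct batch_bucket *b)
{
	assert(!b->locked); /* a second acquisition would deadlock in real code */
	b->locked = true;
}

static void bucket_unlock(struct batch_bucket *b)
{
	assert(b->locked);
	b->locked = false;
}

/*
 * Phase 1: under the bucket lock, take a reference on each socket and
 *          stash it in the batch array.
 * Phase 2: with the lock released, visit each batched socket and drop
 *          the reference. Returns the number of sockets visited.
 */
size_t iterate_bucket_batched(struct batch_bucket *b, visit_fn visit)
{
	struct batch_sock *batch[MAX_BATCH];
	struct batch_sock *sk;
	size_t n = 0, i;

	bucket_lock(b);
	for (sk = b->head; sk && n < MAX_BATCH; sk = sk->next) {
		sk->refcnt++;
		batch[n++] = sk;
	}
	bucket_unlock(b);

	for (i = 0; i < n; i++) {
		visit(b, batch[i]);
		batch[i]->refcnt--;
	}
	return n;
}

/* Example visitor that re-takes the bucket lock to unlink the socket,
 * as a destroy path that unhashes the socket would. */
void unhash_visitor(struct batch_bucket *b, struct batch_sock *sk)
{
	struct batch_sock **pp;

	bucket_lock(b); /* would assert if the iterator still held the lock */
	for (pp = &b->head; *pp; pp = &(*pp)->next) {
		if (*pp == sk) {
			*pp = sk->next;
			break;
		}
	}
	bucket_unlock(b);
}
```

Because the visitor runs only after the bucket lock is dropped, it can take that lock itself without self-deadlock, which is the property the destroy kfunc relies on. The references taken in phase 1 keep the batched sockets valid even after a visitor unlinks them from the bucket.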

The socket destroy kfunc is used to forcefully terminate sockets from certain BPF contexts. We plan to use the capability in Cilium to force client sockets to reconnect when their remote load-balancing backends are deleted. The other use case is on-the-fly policy enforcement, where existing socket connections prevented by policies need to be forcefully terminated. The helper allows terminating sockets that may or may not be actively sending traffic.

The helper is currently exposed to certain BPF iterators, where users can filter and terminate selected sockets. Additionally, the helper can only be called from BPF contexts that ensure socket locking, in order to allow synchronous execution of destroy helpers that also acquire socket locks. The previous commit that batches UDP sockets during iteration facilitated a synchronous invocation of the destroy helper from BPF context by skipping taking socket locks in the destroy handler. TCP iterators already supported batching.

The helper takes a `sock_common` type argument, even though it expects a `sock` pointer and casts the argument to one. This enables the verifier to allow the sock_destroy kfunc to be called for TCP with `sock_common` and UDP with `sock` structs. As a comparison, BPF helpers enable this behavior with the `ARG_PTR_TO_BTF_ID_SOCK_COMMON` argument type. However, there is no such option available with the verifier logic that handles kfuncs, where BTF types are inferred. Furthermore, as `sock_common` only has a subset of the fields of `sock`, casting a pointer to the latter type might not always be safe for certain sockets such as request sockets, but these have special handling in the diag_destroy handlers.

Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
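The `sock_common` to `sock` cast discussed above works because the full socket structure begins with the common header, while a request socket embeds only the common header. The following userspace model is a sketch of that layout argument only. All `mock_*` names, the two state values, and the `-1` return for request sockets are assumptions made for illustration; the real handling of request sockets lives in the diag_destroy handlers and is not reproduced here.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical states; only used to tell the two mock types apart. */
enum mock_state { MOCK_ESTABLISHED = 1, MOCK_NEW_SYN_RECV = 2 };

/* Fields shared by every socket flavor in this model. */
struct mock_sock_common {
	unsigned short family;
	enum mock_state state;
};

/* Full socket: the common header comes first, then extra fields. */
struct mock_sock {
	struct mock_sock_common common;
	int err;
};

/* Request socket: only the common header, so it has no `err` field. */
struct mock_request_sock {
	struct mock_sock_common common;
};

/* The cast below is only valid if the header sits at offset 0. */
_Static_assert(offsetof(struct mock_sock, common) == 0,
	       "common header must be the first member");

/*
 * Takes the common type, as the kfunc does, and casts to the full type
 * only after checking a field that lives in the common header.
 * Returns 0 on success, -1 for request sockets (placeholder for the
 * special handling that this model does not implement).
 */
int mock_sock_destroy(struct mock_sock_common *skc, int err)
{
	struct mock_sock *sk;

	if (skc->state == MOCK_NEW_SYN_RECV)
		return -1; /* must not touch full-socket fields here */

	sk = (struct mock_sock *)skc;
	sk->err = err;
	return 0;
}
```

Checking the state in the common header before the cast is what keeps the function from writing past the end of a request socket.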

The helper will be used to programmatically retrieve and pass ports in userspace and kernel selftest programs.

Suggested-by: Stanislav Fomichev <sdf@google.com>
Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>
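A port-retrieval helper of this kind can be sketched with the standard `getsockname()` call. The function name `local_port_of` and its error convention are assumptions for illustration and are not taken from the patch.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns the local port of sock_fd in host byte
 * order, or -1 on failure or for an unsupported address family.
 */
int local_port_of(int sock_fd)
{
	struct sockaddr_storage addr;
	socklen_t len = sizeof(addr);

	memset(&addr, 0, sizeof(addr));
	if (getsockname(sock_fd, (struct sockaddr *)&addr, &len) < 0)
		return -1;

	if (addr.ss_family == AF_INET)
		return ntohs(((struct sockaddr_in *)&addr)->sin_port);
	if (addr.ss_family == AF_INET6)
		return ntohs(((struct sockaddr_in6 *)&addr)->sin6_port);

	return -1;
}
```

A test can bind a socket to port 0, let the kernel pick a free port, and then call the helper to learn which port was assigned before passing it on to the program under test.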

The test cases for destroying sockets mirror the intended usages of the bpf_sock_destroy kfunc using iterators.

The destroy helpers set the `ECONNABORTED` error code, which we can validate in the test code with client sockets. But UDP sockets have an overriding error code from the disconnect called during abort, so the error code validation is only done for TCP sockets.

Signed-off-by: Aditi Ghag <aditi.ghag@isovalent.com>

kernel-patches-bot · 2023-03-30T15:33:18Z

Upstream branch: 4ca13d1
series: https://patchwork.kernel.org/project/netdevbpf/list/?series=735459
version: 5

kernel-patches-bot · 2023-03-30T17:46:54Z

At least one diff in series https://patchwork.kernel.org/project/netdevbpf/list/?series=735459 expired. Closing PR.

aditighag added 7 commits March 30, 2023 08:32

udp: seq_file: Helper function to match socket attributes

cff8cdf


selftests/bpf: Add helper to get port using getsockname

f86650f


kernel-patches-bot added new bpf-next V5 labels Mar 30, 2023

kernel-patches-bot added changes-requested and removed new labels Mar 30, 2023

kernel-patches-bot closed this Mar 30, 2023

kernel-patches-bot deleted the series/735459=>bpf-next branch April 2, 2023 00:52
