Avoid kernel_recvmsg() #21

fridex · 2016-04-23T12:15:47Z

The current implementation uses kernel_recvmsg() to receive records. This function copies data from the skbuff into the passed vector (see [1] for TCP and [2] for UDP), so it would be nice to avoid it.

When the underlying protocol is TCP, tcp_read_sock() could be used instead. Unfortunately, tcp_read_sock() does not support peeking (see [3]), which the current AF_KTLS design requires.
When the underlying protocol is UDP, there is currently no copy-free logic that could be reused (AFAIK).
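
To make the peeking requirement concrete, here is a minimal user-space sketch of what MSG_PEEK means on an ordinary socket: the data is returned but stays queued, so a later read sees the same bytes. It uses plain recv() on an AF_UNIX socket pair, not AF_KTLS or tcp_read_sock(); the helper names peek_then_read() and demo_peek() are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Peek at up to len bytes without consuming them, then read the same bytes
 * for real. Returns the number of bytes read, or -1 on error or mismatch.
 */
ssize_t peek_then_read(int fd, char *buf, size_t len)
{
    char peeked[64];
    ssize_t n_peek, n_read;

    if (len > sizeof(peeked))
        len = sizeof(peeked);

    /* MSG_PEEK copies the data out but leaves it in the receive queue. */
    n_peek = recv(fd, peeked, len, MSG_PEEK);
    if (n_peek <= 0)
        return -1;

    /* A normal recv() now returns the very same bytes and consumes them. */
    n_read = recv(fd, buf, (size_t)n_peek, 0);
    if (n_read != n_peek || memcmp(peeked, buf, (size_t)n_read) != 0)
        return -1;

    return n_read;
}

/* Hypothetical demo: send one 6-byte "record" over a local socket pair. */
ssize_t demo_peek(void)
{
    static const char record[] = "hello"; /* 6 bytes including the NUL */
    char buf[64];
    int sv[2];
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    n = send(sv[0], record, sizeof(record), 0);
    assert(n == (ssize_t)sizeof(record));

    n = peek_then_read(sv[1], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

Calling demo_peek() should return the record length, since the peek and the subsequent read both see the full record.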

EDIT: we could consider operating directly on the skbuff.

[1] http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L1830
[2] http://lxr.free-electrons.com/source/net/ipv4/udp.c#L1392
[3] http://lxr.free-electrons.com/source/net/ipv4/tcp.c#L1485


fridex · 2016-04-25T13:01:24Z

I think the best approach would be to:

- extend tcp_read_sock() with support for the MSG_PEEK flag
- introduce udp_read_sock() with MSG_PEEK support for UDP

Using skbuffs directly is not nice: there should be appropriate operations on UDP/TCP sockets that encapsulate this logic (and make it reusable in other parts of the kernel).
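
The idea of a callback-based read with an optional peek can be modelled in user space. The sketch below is not kernel code and does not use the real tcp_read_sock() signature; model_queue, model_read_sock(), and take_prefix() are hypothetical names. It only illustrates the intended semantics: the actor sees the queued bytes in place (no intermediate copy), and the queue is advanced only when peek is not requested.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a socket receive queue. */
struct model_queue {
    const char *data;  /* queued bytes */
    size_t len;        /* total number of queued bytes */
    size_t consumed;   /* bytes already consumed by earlier reads */
};

/* Actor callback: inspects data in place, returns how many bytes it used. */
typedef size_t (*model_actor_t)(void *ctx, const char *data, size_t len);

/*
 * Hand the unconsumed bytes to the actor without copying them.
 * With peek != 0 the queue position is left untouched.
 */
size_t model_read_sock(struct model_queue *q, model_actor_t actor,
                       void *ctx, int peek)
{
    size_t avail, used;

    assert(q->consumed <= q->len);
    avail = q->len - q->consumed;
    if (avail == 0)
        return 0;

    used = actor(ctx, q->data + q->consumed, avail);
    if (used > avail)
        used = avail;
    if (!peek)
        q->consumed += used;
    return used;
}

/* Example actor: take at most `want` bytes, e.g. a record header. */
struct hdr_ctx {
    size_t want;
    const char *seen;  /* points into the queue, not a copy */
};

size_t take_prefix(void *ctx, const char *data, size_t len)
{
    struct hdr_ctx *c = ctx;

    c->seen = data;
    return len < c->want ? len : c->want;
}
```

A caller would first peek at the header to learn the record length and then issue a consuming read once the whole record is available.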

fridex · 2016-05-09T07:32:40Z

Running the "splice echo time" scenario for 2 seconds, a simple ping-pong with the server [1]:

splice(ksd, NULL, pipe, NULL, 1400, 0);
splice(pipe, NULL, ksd, NULL, 1400, 0);
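
For readers who want to try the socket-to-pipe-to-socket pattern outside AF_KTLS, here is a rough user-space sketch. It substitutes an AF_UNIX socket pair for the AF_KTLS socket (ksd), which is an assumption made purely so the example runs anywhere on Linux; splice_echo_once() and demo_splice_echo() are hypothetical helper names, not taken from af_ktls-tool.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_sock to out_sock through a pipe, without
 * copying them into a user-space buffer. Returns bytes forwarded, or -1.
 */
ssize_t splice_echo_once(int in_sock, int out_sock, int pipefd[2], size_t len)
{
    ssize_t in, out, total = 0;

    /* socket -> pipe (write end) */
    in = splice(in_sock, NULL, pipefd[1], NULL, len, 0);
    if (in <= 0)
        return -1;

    /* pipe (read end) -> socket; loop in case of short writes */
    while (total < in) {
        out = splice(pipefd[0], NULL, out_sock, NULL, (size_t)(in - total), 0);
        if (out <= 0)
            return -1;
        total += out;
    }
    return total;
}

/* Hypothetical demo: one ping-pong round over a local socket pair. */
int demo_splice_echo(void)
{
    char msg[] = "ping";
    char reply[sizeof(msg)] = { 0 };
    int sv[2], pipefd[2];
    int ok = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(pipefd) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* "Client" sends on sv[0]; the "server" on sv[1] echoes it back. */
    if (send(sv[0], msg, sizeof(msg), 0) == (ssize_t)sizeof(msg) &&
        splice_echo_once(sv[1], sv[1], pipefd, 1400) == (ssize_t)sizeof(msg) &&
        recv(sv[0], reply, sizeof(reply), 0) == (ssize_t)sizeof(msg) &&
        memcmp(msg, reply, sizeof(msg)) == 0)
        ok = 0;

    close(pipefd[0]);
    close(pipefd[1]);
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

demo_splice_echo() returns 0 when the echoed bytes match what was sent. The 1400-byte length mirrors the calls above; with an AF_KTLS socket the same two splice() calls would go through tls_splice_read and the send path that the measurements below profile.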

With MTU 1400:

I am getting the following results:

44.24% of total time spent in kernel_sendmsg()
- 38.28% of total time spent in tcp_push (the actual sending)
- 1.15% of total time spent allocating socket buffers (sk_stream_alloc_skb)
- approx. 2% on copying from the kernel vector (copy_from_iter, memcpy_erms)
33.14% of total time spent in tls_splice_read
- 13.14% of total time spent in kernel_recvmsg
- approx. 2% on copy and allocation (skb_copy_datagram_iter, copy_page_to_iter)

With MTU 16000:

I am getting the following results:

22.29% of total time spent in kernel_sendmsg()
- 16.30% of total time spent in tcp_push (the actual sending)
- 0.69% of total time spent allocating socket buffers (sk_stream_alloc_skb)
- 3.03% on copying from the kernel vector (copy_from_iter, memcpy_erms)
42.25% of total time spent in tls_splice_read
- 9.02% of total time spent in kernel_recvmsg
- 4.02% on copy and allocation (skb_copy_datagram_iter, copy_page_to_iter)

Ideally we could save:

for 1400 MTU:
- approx. 2% by avoiding kernel_recvmsg()
- approx. 3.15% by avoiding kernel_sendmsg()
for 16000 MTU:
- 3.72% by avoiding kernel_sendmsg()
- 4.02% by avoiding kernel_recvmsg()
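
The kernel_sendmsg() savings above appear to be the sum of the allocation and copy lines from the perf results (for example 1.15% + approx. 2% = approx. 3.15% at MTU 1400); that reading is an inference from the numbers, not something stated explicitly. A small sketch of that arithmetic, with hypothetical names:

```c
#include <assert.h>

/* Percent of total run time, taken from the perf results above. */
struct copy_overhead {
    double alloc_pct;  /* socket buffer allocation */
    double copy_pct;   /* copying to/from the kernel vector */
};

/* Upper bound on what a copy-free path could save (assumed: alloc + copy). */
double potential_saving(struct copy_overhead o)
{
    assert(o.alloc_pct >= 0.0 && o.copy_pct >= 0.0);
    return o.alloc_pct + o.copy_pct;
}

/* Compare doubles with a small tolerance instead of exact equality. */
int nearly_equal(double a, double b)
{
    double d = a - b;

    return d < 1e-9 && d > -1e-9;
}
```

With the measured values, potential_saving() gives 3.15 for MTU 1400 (1.15 and 2.0) and 3.72 for MTU 16000 (0.69 and 3.03), matching the kernel_sendmsg() lines in the list.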

We have to consider the additional logic within kernel_sendmsg() and kernel_recvmsg() (locking, ...). Using kernel_sendpage() and tcp_read_sock() (or udp_read_sock()) may involve different logic, which could have a positive or negative impact as well.

perf reports that context switches are not expensive at all (0.30% of total).

related: https://github.com/fridex/af_ktls/issues/22

[1] https://github.com/fridex/af_ktls-tool/blob/master/action.c#L795

djwatson · 2016-07-28T17:00:09Z

fixed by #62

fridex added the enhancement label Apr 23, 2016

fridex mentioned this issue Apr 25, 2016

Documentation: no parallel socket operations on AF_KTLS and bound socket are possible without explicit synchronization #25

Closed

fridex mentioned this issue May 18, 2016

Do MSG_PEEK call only once #37

Closed

fridex mentioned this issue Jun 15, 2016

support poll #33

Closed

djwatson closed this as completed Jul 28, 2016

fridex mentioned this issue Aug 10, 2016

peek tcp data using tcp_read_sock #74

Closed
