socket.recvfrom crops UDP packets #3619

rosik · 2018-08-10T12:31:10Z

Here is a reproducer

#!/usr/bin/env tarantool

local log = require('log')
local socket = require('socket')
local port = 33001

sock = socket('AF_INET', 'SOCK_DGRAM', 'udp')
sock:bind('0.0.0.0', port)
local n_snd = sock:sendto('127.0.0.1', port, string.ljust('', 1025, 'x'))
log.info('Sent %d', n_snd)

local msg1, from1 = sock:recvfrom()
log.info('Recv %d from %s:%d', #msg1, from1.host, from1.port)

local msg2, from2 = sock:recvfrom()
log.info('Recv %s from %s', msg2, from2)

The output is

Sent 1025
Recv 512 from 127.0.0.1:33001
Recv nil from nil

The problem is that the message is being cropped by 512 bytes by default (https://github.com/tarantool/tarantool/blob/2.0/src/lua/socket.lua#L802)
And I can not receive large packets in multiple reads.

The text was updated successfully, but these errors were encountered:

When size parameter is not passed to socket:recv() or socket:recvfrom() it will call recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK) on the socket to evaluate size of the buffer to store the receiving datagram. Before this commit a datagram will be cropped to 512 bytes in the case. Added socket:is_udp() and socket:dgram_length() methods to the public API. Part of #3619.

We set MSG_TRUNC for recv / recvfrom calls to receive the real length of the datagram. When the datagram length is larger then the buffer size we set the EMSGSIZE errno, but still return `from` table (in case of recvfrom). Fixes #3619.

When size parameter is not passed to socket:recv() or socket:recvfrom() it will call recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK) on the socket to evaluate size of the buffer to store the receiving datagram. Before this commit a datagram will be cropped to 512 bytes in the case. Added socket:is_udp() and socket:dgram_length() methods to the public API. Part of #3619.

We set MSG_TRUNC for recv / recvfrom calls to receive the real length of the datagram. When the datagram length is larger then the buffer size we set the EMSGSIZE errno, but still return `from` table (in case of recvfrom). Fixes #3619.

Part of #3619.

When size parameter is not passed to socket:recv() or socket:recvfrom() it will call a) or b) on the socket to evaluate size of the buffer to store the receiving datagram. Before this commit a datagram will be truncated to 512 bytes in the case. a) Linux: recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK) b) Mac OS: getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len) It is recommended to set 'size' parameter (size of the input buffer) explicitly based on known message format and known network conditions (say, set it less then MTU to prevent IP fragmentation, which can be inefficient) or pass it from a configuration option of a library / an application. The reason is that explicit buffer size provided allows to avoid extra recv syscall on Linux. Added socket:is_udp() and socket:dgram_length() methods to the public API. Part of #3619.

When the datagram length is larger then the buffer size we discard the datagram, return nil as the message, set errno to EMSGSIZE, but still return `from` table (in case of recvfrom). The commit changes behaviour of socket:recv and socket:recvfrom methods in case of a UDP socket. They still truncate the input in case of a TCP socket (because of stream nature of TCP). On Linux we set MSG_TRUNC for recv / recvfrom calls to receive the real length of the datagram. On Mac OS we call getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len) before the recv / recvfrom call. These extra steps are avoided for TCP sockets, because MSG_TRUNC leads to discarding the input (Linux) and to avoid extra getsockopt syscall (Mac OS). Fixes #3619.

When size parameter is not passed to socket:recv() or socket:recvfrom() it will call a) or b) on the socket to evaluate size of the buffer to store the receiving datagram. Before this commit a datagram will be truncated to 512 bytes in the case. a) Linux: recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK) b) Mac OS: getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len) It is recommended to set 'size' parameter (size of the input buffer) explicitly based on known message format and known network conditions (say, set it less then MTU to prevent IP fragmentation, which can be inefficient) or pass it from a configuration option of a library / an application. The reason is that explicit buffer size provided allows to avoid extra syscall to evaluate necessary buffer size. When 'size' parameter is set explicitly for recv / recvfrom on a UDP socket and the next datagram length is larger then the size, the returned message will be truncated to the size provided and the rest of the datagram will be discarded. Of course, the tail will not be discarded in case of a TCP socket and will be available to read by the next recv / recvfrom call. Fixes #3619.

Totktonada · 2018-08-27T15:42:54Z

Decided to:

evaluate the input buffer size (for UDP) when no explicit size provided;
truncate the datagram and discard the tail if it is larger then explicitly provided input buffer size (as we do now).

Cite from the Yaroslav's (@rosik) message:

I've tested the new behavior.
The recvfrom(nil) now works great.

However I doubt about returning nil instead of truncated message.
It breaks backward compatibility and it is not worth of it:
When a user writes recvfrom(32) I guess he is ready to get truncated message.
If he is not, it's easier to call new version of recvfrom(nil) than to check return value every time.

Totktonada · 2018-08-27T18:43:21Z

@TarantoolBot document
Title: socket:recv() / socket:recvfrom() updates

API change (for a UDP socket after socket.recvfrom crops UDP packets #3619): a datagram will not be truncated to 512 bytes and the input buffer size will be evaluated when no explicit size was passed (but the evaluating leads to extra syscall); check the commit message for more information.
API undocumented (for a UDP socket): a datagram will be truncated to size length in case when explicit size was passed, but the datagram length is larger then the size; the tail of the datagram will be discarded in the case; don't forget to note TCP behaviour (it is very different); check the commit message for more information.
Recommendation missed: recommended to set size explicitly for recv and recvfrom; don't forget to give recommendations whether it is possible to determine the size (message format, network conditions) and how to do so depending of network conditions (add note that datagram sizes above MTU will lead to IP fragmentation that can be inefficient; add note that there is tuneable sysctl net.inet.udp.maxdgram on Mac OS); check the commit message for more information.
Minor: update 'Use of a UDP socket on localhost' to use explicit size (say, 512), because we will recommend it for readers.
Minor: ToC sorting is such that recv is far away from recvfrom.
Monor: inconsistency in parameters naming: recv(size) vs recvfrom(limit). The size term should be preferred IMHO, because AFAIK the word 'size' usually applied for the size of a buffer (while 'length' usually applied to a message length).

Feel free to ask more informatiom from @Totktonada.

TarantoolBot · 2018-08-27T18:43:22Z

@Totktonada: Accept.

When size parameter is not passed to socket:recv() or socket:recvfrom() it will call a) or b) on the socket to evaluate size of the buffer to store the receiving datagram. Before this commit a datagram will be truncated to 512 bytes in the case. a) Linux: recv(fd, NULL, 0 , MSG_TRUNC | MSG_PEEK) b) Mac OS: getsockopt(fd, SOL_SOCKET, SO_NREAD, &val, &len) It is recommended to set 'size' parameter (size of the input buffer) explicitly based on known message format and known network conditions (say, set it less then MTU to prevent IP fragmentation, which can be inefficient) or pass it from a configuration option of a library / an application. The reason is that explicit buffer size provided allows to avoid extra syscall to evaluate necessary buffer size. When 'size' parameter is set explicitly for recv / recvfrom on a UDP socket and the next datagram length is larger then the size, the returned message will be truncated to the size provided and the rest of the datagram will be discarded. Of course, the tail will not be discarded in case of a TCP socket and will be available to read by the next recv / recvfrom call. Fixes #3619.

locker · 2018-08-30T10:24:53Z

Fixed by 11fb3ab

kyukhin added this to the 1.10.2 milestone Aug 10, 2018

kyukhin added bug Something isn't working lua labels Aug 10, 2018

Totktonada self-assigned this Aug 10, 2018

Totktonada added a commit that referenced this issue Aug 23, 2018

WIP: socket: fix dgram_length and recv* for OS X

cfcb979

Part of #3619.

kyukhin added the ready for review label Aug 29, 2018

locker closed this as completed Aug 30, 2018

TarantoolBot mentioned this issue Aug 30, 2018

socket:recv() / socket:recvfrom() updates tarantool/doc#606

Closed

locker removed the ready for review label Aug 30, 2018

kyukhin added the tmp label Oct 10, 2018

locker removed tmp labels Nov 28, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

socket.recvfrom crops UDP packets #3619

socket.recvfrom crops UDP packets #3619

rosik commented Aug 10, 2018

Totktonada commented Aug 27, 2018

Totktonada commented Aug 27, 2018

TarantoolBot commented Aug 27, 2018

locker commented Aug 30, 2018

socket.recvfrom crops UDP packets #3619

socket.recvfrom crops UDP packets #3619

Comments

rosik commented Aug 10, 2018

Totktonada commented Aug 27, 2018

Totktonada commented Aug 27, 2018

TarantoolBot commented Aug 27, 2018

locker commented Aug 30, 2018