-
Notifications
You must be signed in to change notification settings - Fork 3
Sockets
TCP/UDP connection is identified by so called 5-tuple:
(protocol, source address, source port, destination address, destination port)
Protocol is set when socket()
is called
Source address and source port is set when bind()
is called
Destination address and destination prot is set when connect()
is called (even for UDP)
In order to bind to any address, user can specify address 0.0.0.0
or ::
. In practice it means: all IP addresses of all local interfaces. During the succeeding connect()
call the OS will choose proper source IP based on destination address and contents of routing table.
In order to bind to any ephemeral port, user can specify port = 0
, then the OS will choose the port
By default no two sockets can be bound to same (source address, source port)
, e.g., if any socket is bound to 0.0.0.0:21
, then no other address can be bound to port 21
syscall | info |
---|---|
socket(domain, type, protocol) | returns socket file descriptor (fd) |
bind(fd, *addr, addrlen) | bind to address, returns error code |
listen(fd, backlog) | mark socket as passive (this is: as a socket accepting connections), backlog determines the maximum length to which the queue of pending connections for fd may grow |
send() | adds the data to the send buffer, it doesn't necessarily mean that the data has been sent out. In case of UDP it is possible that the data is send immediately, but for TCP it is unlikely |
Socket behavior can be changed, moreover options meaning differ with every OS
Enabled prior to binding.
Theoretically SO_REUSEADDR
has effect only on wildcard addresses and affects possibility of binding to 'taken' address.
SO_REUSEADDR | socketA | socketB | result (of socketB bind() ) |
---|---|---|---|
true/false | 192.168.1.1:21 |
192.168.1.1:21 |
EADDRINUSE |
true/false | 192.168.1.1:21 |
10.0.0.1:21 |
OK |
false | 0.0.0.0:21 |
192.168.1.1:21 |
EADDRINUSE |
true | 0.0.0.0:21 |
192.168.1.1:21 |
OK |
true/false | 0.0.0.0:21 |
0.0.0.0:21 |
EADDRINUSE |
Impact:
- For TCP sockets: the TCP socket after
close()
call, finally transitions to theTIME_WAIT
(waiting for the data in socket buffers to be send). The amount of time the socket stays in this state is determined by linger time (OS and socket level -SO_LINGER
- configurable option). IfSO_REUSEADDR
is not set for such socket then closed TCP socket is still considered bound (up to linger time). However ifSO_REUSEADDR
is set, it is possible to bind to such (not-yet-fully) closed socket (without waiting linger time) - For TCP sockets: it is possible that socket reusing address connects to the same destination address and port. Thus the 5-tuple is duplicated and
connect()
will fail withEADDRINUSE
- For UDP sockets: multiple multicast (one-to-many) sockets with
SO_REUSEADDR
may be bound to same combination of multicast address and port (same behavior likeSO_REUSEPORT
)
Allows arbitrary number of sockets to bind to exactly the same source address and port as long as all prior bound sockets also had SO_REUSEPORT
set.
For Linux (>= 3.9): all sockets that want to share the same combination of address and port must belong to processes that share the same effective user ID.
Equal to calling bind
for address "0.0.0.0" (sin_addr.s_addr
), which specifies all available interfaces.
Backlog states how many connections can be queued before handing over (via accept call) to the application. TCP three-way handshake has SYN RECEIVED (SYN received and SYN+ACK sent) intermediate state, which transits to ESTABLISHED after receiving the ACK from client. Thus backlog can be implemented in two ways:
- The implementation uses a single queue, the size of which is determined by the backlog argument of the listen syscall. When a SYN packet is received, it sends back a SYN/ACK packet and adds the connection to the queue. When the corresponding ACK is received, the connection changes its state to ESTABLISHED and becomes eligible for handover to the application. This means that the queue can contain connections in two different states: SYN RECEIVED and ESTABLISHED. Only connections in the latter state can be returned to the application by the accept syscall. In this approach if the queue is full then any consecutive three-way handshake will be unsuccessful (client syn will be dropped). Not used in linux
- The implementation uses two queues, a SYN queue (or incomplete connection queue) and an accept queue (or complete connection queue). Connections in state SYN RECEIVED are added to the SYN queue and later moved to the accept queue when their state changes to ESTABLISHED, i.e. when the ACK packet in the 3-way handshake is received. As the name implies, the accept call is then implemented simply to consume connections from the accept queue. In this case, the backlog argument of the listen syscall determines the size of the accept queue (second queue). Linux default.
To set the max length of incomplete queue (first queue) use:/proc/sys/net/ipv4/tcp_max_syn_backlog
. What happens when accept queue is full and client ACK arrives? Depends on/proc/sys/net/ipv4/tcp_abort_on_overflow
it may clean the incomplete connection (drop)
- General
- OS
- Networks
- Configuration
- Protocols
- Link layer
- Sockets
- Routing
- Tunneling
- Debugging
- LoRa
- Virtualization
- Infrastructure as a code
- Desktop environments
- Monitoring
- Benchmarking