v1.1.0

benoitc released this 18 Apr 07:50
· 176 commits to main since this release
b1abf60

Server-side throughput release. On Linux with the socket backend,
per-connection send batching over the shared listener socket
coalesces outgoing packets into sendmsg super-datagrams via
UDP_SEGMENT (GSO); on macOS / gen_udp it is functionally neutral.
The release also carries several GSO correctness fixes made after
CI surfaced a handshake stall, plus extra observability so tests
and operators can see the batching win directly.

Added

  • server_send_batching option on quic:start_server/3 (default true). Set false to fall back to direct gen_udp:send/4.
  • quic_socket:info/1 — map with backend, gso_supported, gso_size, gro_enabled, batching_enabled, max_batch_packets, batch_flushes, packets_coalesced.
  • quic_socket:send_immediate/4 — public wrapper that bypasses the per-connection batch for one-shot control-plane sends.
  • quic_socket:new_sender/2 — per-connection sender that inherits backend + GSO capability from the listener without owning the socket.
  • quic_connection:get_stats/1 now returns batch_flushes and packets_coalesced.
  • quic_server_batching_SUITE behaviour-level regression suite: runs real 256 KB server-to-client downloads and asserts on the batching counters.
  • docker/gso-debug/ reproducer container (Erlang 28 + tcpdump + strace).
  • Download-bench driver (bench/run_download_bench.erl) and quic_throughput_bench:run_download_sink/0,1.
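
The new counters can be turned into a rough "packets per flush" figure. The following is a minimal sketch, not part of the library: the module and function names are hypothetical, it only assumes a map carrying the batch_flushes and packets_coalesced keys listed above, and it assumes packets_coalesced counts the packets that went out in batched flushes (the notes do not spell out the exact semantics).

```erlang
-module(batching_summary).
-export([avg_packets_per_flush/1]).

%% Hypothetical helper. Info is assumed to be a map shaped like the one
%% quic_socket:info/1 is described as returning; only two keys are read.
-spec avg_packets_per_flush(map()) -> float() | undefined.
avg_packets_per_flush(#{batch_flushes := 0}) ->
    %% No flush has happened yet, so there is no ratio to report.
    undefined;
avg_packets_per_flush(#{batch_flushes := Flushes, packets_coalesced := Packets})
  when is_integer(Flushes), Flushes > 0, is_integer(Packets) ->
    %% Assumed reading: average number of packets carried per flush.
    Packets / Flushes;
avg_packets_per_flush(_Other) ->
    %% Counters missing (for example, a map from an older release).
    undefined.
```

A value close to 1.0 would suggest that batching is rarely coalescing anything, while larger values point to a real batching win; treat the number as an indicator only.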

Changed

  • Stream send path is now iovec-native (quic_frame:encode_iodata/1) and threads iodata through header protection and AEAD without copying.
  • 1-RTT ACKs are now delayed to every 2nd packet or max_ack_delay, whichever comes first, per RFC 9000 §13.2. +24% upload throughput on macOS gen_udp.
  • quic_loss switched to a queue:queue(#sent_packet{}). Per-ACK work scales with the ACK window; +7% upload, +47% download on macOS gen_udp.
  • flush_gso/1 passes the batch as an iov list to socket:sendmsg/2 with the UDP_SEGMENT cmsg (saves ~76 KB user-space copy per 64-packet flush on Linux).
  • send_app_packet_internal/3 samples monotonic_time once per packet and reuses it for loss tracking + activity.
  • Per-packet overhead reductions on bulk-send: single #state{} update, PTO reschedule skipped within tolerance, queue short-circuits, stream data normalised once.
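
The delayed-ACK rule in the list above (acknowledge every 2nd packet, or once max_ack_delay has elapsed) can be sketched as a pure decision function. This is an illustration under stated assumptions, not the library's code: the module name, function name, and millisecond units are hypothetical.

```erlang
-module(ack_delay_sketch).
-export([should_ack_now/3]).

%% Unacked:       packets received since the last ACK was sent.
%% ElapsedMs:     time since the oldest of those packets arrived.
%% MaxAckDelayMs: the max_ack_delay limit, in milliseconds (assumed unit).
-spec should_ack_now(non_neg_integer(), non_neg_integer(), pos_integer()) ->
          boolean().
should_ack_now(0, _ElapsedMs, _MaxAckDelayMs) ->
    %% Nothing is waiting for an acknowledgement.
    false;
should_ack_now(Unacked, _ElapsedMs, _MaxAckDelayMs) when Unacked >= 2 ->
    %% Second packet since the last ACK: acknowledge immediately.
    true;
should_ack_now(_Unacked, ElapsedMs, MaxAckDelayMs) ->
    %% A single packet is pending: wait until the delay budget is spent.
    ElapsedMs >= MaxAckDelayMs.
```

Halving the number of ACKs the receiver emits is a plausible source of the upload gain quoted above, since each ACK costs a packet build, an encryption pass, and a send.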

Fixed

  • Server connection crashed with function_clause on socket_backend => socket because inet:sockname/1 rejects {'$socket', Ref} handles.
  • UDP_SEGMENT setsockopt now uses sizeof(int) (32-bit native); Linux was rejecting the previous u16 with EINVAL, silently disabling GSO. (#67)
  • GSO skipped for single-packet batches (ubuntu-24.04 dropped sub-gso_size single-packet payloads).
  • Listener no longer sets UDP_SEGMENT at socket level; GSO applied only via the per-message cmsg.
  • GSO bypassed when a batch mixes packet sizes (Initial + Handshake) so the client no longer stalls at awaiting_encrypted_extensions. (#75)
  • Listener self-send path switched to send_immediate/4 — VN / retry / stateless-reset packets were being buffered then lost with batching_enabled=true.
  • send_queue_bytes leak on ACK-coalesce dequeue; added send_queue_count as an O(1) empty predicate so zero-byte FIN-only sends aren't stranded under pacing.
  • examples/echo_server.erl and examples/qlog_example.erl restored. (#65, #68)
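
Two of the GSO fixes above amount to an eligibility check before a flush: skip GSO for single-packet batches and for batches that mix packet sizes. A minimal sketch of that check follows; the module and function names are hypothetical, and the real flush path may apply different or additional conditions.

```erlang
-module(gso_sketch).
-export([flush_mode/1]).

%% Decide how a batch of encoded packets could be sent.
%% Returns {gso, SegmentSize} when every packet has the same size and
%% there are at least two of them; otherwise returns plain.
-spec flush_mode([iodata()]) -> plain | {gso, non_neg_integer()}.
flush_mode([]) ->
    plain;
flush_mode([_Single]) ->
    %% Single-packet batches gain nothing from segmentation.
    plain;
flush_mode([First | Rest]) ->
    Size = iolist_size(First),
    case lists:all(fun(P) -> iolist_size(P) =:= Size end, Rest) of
        true -> {gso, Size};
        %% Mixed sizes (e.g. Initial + Handshake): send packets one by one.
        false -> plain
    end.
```

The check is deliberately strict. Linux itself allows the final segment of a GSO send to be shorter than the segment size, so an implementation could relax the uniform-size rule for the last packet; the sketch follows the simpler rule stated in the notes.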

Verification

  • rebar3 eunit — 1937/1937 passing
  • rebar3 proper — 75/75 properties passing
  • rebar3 dialyzer / rebar3 xref / rebar3 lint / rebar3 fmt --check — clean
  • rebar3 ct --suite=quic_e2e_SUITE — 10/10
  • rebar3 ct --suite=quic_h3_e2e_SUITE — 12/12
  • rebar3 ct --suite=quic_server_batching_SUITE — 3/3 (+ GSO case, passes on ubuntu-24.04 with QUIC_ENABLE_GSO_TEST=1)
  • rebar3 ct --suite=quic_interop_SUITE — 15/15 (aioquic + quic-go)
  • docker/dist/scripts/run_test.sh 2 — 2-node cluster mesh + 1 MB RPC transfer PASS