Minor release: first-class UDP load balancing. Additive — the new protocol = "udp" listener type is opt-in, and the new proto messages (UdpListenerConfig, RequestUdpFrontend), configuration keys, and metrics all carry safe defaults, so existing configurations and sozu-command-lib consumers can bump from ^2.0.2 to ^2.1.0 without code changes.
✨ Added
feat(udp): first-class UDP listeners with load balancing (#1274; RFC #1273, closes #654).
A new protocol = "udp" listener type sits alongside tcp/http/https, with full hot-reconfig and CLI parity, fronting datagram services (DNS, syslog, NTP, generic UDP). The userland datapath runs in the existing single-threaded mio loop with static per-listener worker ownership (scale by running multiple listeners) and virtual 4-tuple flow sessions torn down by a three-knob policy (responses / idle timeout / requests). Highlights:
- Load balancing: two source-hash algorithms selectable per cluster via the shared load_balancing key: HRW (Highest-Random-Weight / rendezvous hashing, recommended UDP default, no rebuild stall on reconfig) and MAGLEV (opt-in O(1) lookup table for large backend sets), plus the existing ROUND_ROBIN. Flow affinity keys on the client source IP (SOURCE_IP) or full 2-tuple (SOURCE_IP_PORT).
- PROXY protocol v2 to backend (send_proxy_protocol, first datagram by default; proxy_protocol_every_datagram opt-in), carrying the real client address.
- Active health checks ([clusters.<id>.udp.health]): companion TCP_PROBE (primary) or app-level UDP_PROBE, with rise/fall hysteresis and fail-open.
- Metrics: udp.datagrams.{in,out}, udp.bytes.{in,out}, udp.active_flows (gauge), udp.flows.{created,evicted,shed}, udp.datagrams.dropped (by reason), udp.backend.health, udp.flow.duration.
- Config: [[listeners]] gains max_rx_datagram_size (capped at the global buffer_size; oversized datagrams truncate and drop) and max_flows (0 = auto, ~70% of the soft RLIMIT_NOFILE). Inbound UDP PROXY-protocol decode (expect_proxy) is not supported. Plaintext only; in-flight flows reset on hot-upgrade (listener fd handed off, flow state not migrated). See doc/configure.md for the full schema.
- Testing: a FoundationDB/VOPR-style deterministic simulation harness (lib/tests/udp_simulation.rs) drives the pure sans-io UDP core through a seeded, adversarial workload (weighted action grammar + FoundationDB-buggify fault injection + a 256-seed sweep), layered on dense debug_assert invariants (TigerBeetle TigerStyle) in manager.rs/flow.rs; failures reproduce from a printed seed via SOZU_UDP_SIM_SEED. See doc/testing.md and doc/udp_simulation.md.
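To illustrate why rendezvous hashing needs no table rebuild on reconfig, here is a minimal, unweighted sketch of HRW selection keyed on the client source IP. It is not the project's implementation: the Backend type, the score and pick_hrw functions, and the use of the standard library's DefaultHasher are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::net::IpAddr;

/// Hypothetical backend descriptor for this sketch (not the real sozu type).
#[derive(Debug, Clone, PartialEq)]
pub struct Backend {
    pub id: String,
}

/// Score one (client, backend) pair by hashing both together.
/// DefaultHasher is an assumption chosen for brevity; a real datapath
/// would likely pick a faster, explicitly seeded hash.
fn score(client: &IpAddr, backend: &Backend) -> u64 {
    let mut hasher = DefaultHasher::new();
    client.hash(&mut hasher);
    backend.id.hash(&mut hasher);
    hasher.finish()
}

/// Rendezvous (HRW) selection: the backend with the highest score wins.
/// Each backend is scored independently, so adding or removing one
/// backend only remaps the clients whose winner was that backend.
pub fn pick_hrw<'a>(client: &IpAddr, backends: &'a [Backend]) -> Option<&'a Backend> {
    backends.iter().max_by_key(|b| score(client, b))
}
```

Selection costs one hash per backend, which is the trade-off against a precomputed lookup table: there is nothing to rebuild when the backend set changes, but lookup time grows with the number of backends.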
🐛 Fixed
- fix(state): hot-reconfig no longer silently deactivates a listener whose configuration changed while staying active. ConfigState::diff() replayed an in-place listener configuration change as Remove+Add without re-emitting ActivateListener when the target state kept the listener active, so a state replay (hot upgrade, LoadState) silently deactivated it. Fixed for all four listener types (HTTP/HTTPS/TCP/UDP); the diff() post-condition now asserts listener-map convergence, including udp_listeners.
- fix(router): pattern-trie arithmetic-underflow panic on regex-leading hostnames. A frontend hostname whose leftmost segment is a regex (e.g. /test[0-9]/.example.com) underflowed pos - 1 to usize::MAX and panicked on re-insert (the dedup loop) and on any lookup_mut. Both paths now special-case pos == 0 (the regex subtree is already a value-bearing leaf); regression test added.
- fix(ci/release): pin the cosign binary to the 2.x line. sigstore/cosign-installer floated the cosign binary to v3, whose cosign sign-blob defaults to --new-bundle-format and ignores --output-signature/--output-certificate, so the 2.0.2 release run failed at the signing step (create bundle file: open : no such file or directory) and the draft GitHub release + Docker push were skipped. Pin cosign-release: v2.6.3 (latest 2.x, ≥ the documented 2.4.1 verify baseline) so keyless signing keeps emitting the SHA256SUMS.sig + SHA256SUMS.pem side files. The 2.1.0 tag is the first to exercise the fix. Migrating verification to the v3 --bundle format is tracked separately.
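The class of bug behind the router fix can be shown in isolation. The sketch below is not the pattern-trie code; wrapped_previous, previous_segment and describe are hypothetical helpers that only demonstrate how an unsigned pos - 1 behaves at pos == 0 and how the zero case can be handled explicitly.

```rust
/// What an unchecked `pos - 1` evaluates to when overflow checks are off
/// and `pos == 0`: the subtraction wraps around to usize::MAX.
pub fn wrapped_previous(pos: usize) -> usize {
    pos.wrapping_sub(1)
}

/// Checked alternative: `None` signals that there is no previous position,
/// so the caller must handle `pos == 0` as its own case.
pub fn previous_segment(pos: usize) -> Option<usize> {
    pos.checked_sub(1)
}

/// Hypothetical caller that special-cases position zero instead of
/// indexing with a wrapped value.
pub fn describe(pos: usize) -> &'static str {
    match previous_segment(pos) {
        None => "position 0: treat the current node as the leaf",
        Some(_) => "position > 0: continue from the previous segment",
    }
}
```

Using checked_sub makes the zero case impossible to overlook, because the caller has to handle the Option before it can obtain an index.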
🔄 Changed
- TigerStyle assertion-density sweep across lib + command + bin: ~1100 debug_assert!s + 16 private check_invariants() full-sweeps across the H2/H1 mux, all protocols, the data plane, the server event loop, the command plane (channel/SCM/state/config/request/response/cert), and the bin supervisor + master command server. All compiled out in release builds, so no production behavior change; they run live in every test/e2e/fuzz/dev build, turning silent correctness bugs into loud crashes. Doctrine in doc/testing.md.
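As a rough illustration of the pattern described above, the sketch below pairs per-operation debug_assert! checks with a private check_invariants() sweep. FlowTable and its fields are hypothetical and are not taken from the code base; only debug_assert! itself is standard Rust, and it is compiled out when debug assertions are disabled (the default for release builds).

```rust
/// Hypothetical bounded flow table used only to demonstrate the pattern.
pub struct FlowTable {
    active: usize,
    max_flows: usize,
}

impl FlowTable {
    pub fn new(max_flows: usize) -> Self {
        FlowTable { active: 0, max_flows }
    }

    pub fn active(&self) -> usize {
        self.active
    }

    /// Full sweep over the structure's invariants. Every check is a
    /// debug_assert!, so the body is empty without debug assertions.
    fn check_invariants(&self) {
        debug_assert!(
            self.active <= self.max_flows,
            "active flows ({}) exceed max_flows ({})",
            self.active,
            self.max_flows
        );
    }

    /// Admit a flow, or return false (shed) when the table is full.
    pub fn try_insert(&mut self) -> bool {
        if self.active == self.max_flows {
            return false;
        }
        self.active += 1;
        self.check_invariants();
        true
    }

    /// Remove one flow; removing from an empty table is a logic bug that
    /// crashes loudly in debug builds and saturates at zero otherwise.
    pub fn remove(&mut self) {
        debug_assert!(self.active > 0, "remove called on an empty table");
        self.active = self.active.saturating_sub(1);
        self.check_invariants();
    }
}
```

Calling the sweep at the end of every mutating method keeps the checks close to the code that could break them, while release builds pay nothing for it.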
🤖 CI
- The nightly fuzz job now also runs fuzz_udp_flow alongside fuzz_frame_parser and fuzz_hpack_decoder; a new test-hygiene job rejects bare #[ignore] attributes (a reason string is required); and a scheduled simulation-sweep.yml runs a daily FoundationDB-style seed swarm over the deterministic UDP simulator (debug build, TigerStyle invariants live) plus extended fuzzing of all three targets.
📚 Documentation
- doc/testing.md: the authoritative testing doctrine (assertion-first TigerBeetle TigerStyle + FoundationDB-style deterministic simulation, the five test categories, how to run each), with the contributor checklist in CONTRIBUTING.md#testing.
- lib/src/protocol/udp/LIFECYCLE.md: canonical internals reference for the UDP load-balancing datapath, matching the existing mux/kawa_h1/proxy_protocol LIFECYCLE siblings.
- doc/configure.md / bin/config.toml: the cluster load-balancing key is load_balancing (not load_balancing_policy), now documenting all six algorithms: ROUND_ROBIN, RANDOM, LEAST_LOADED, POWER_OF_TWO, HRW, MAGLEV.