feat(net): PhyriadNet/1 pillar — Phases 0-6 complete #8
Merged
The `last_block` local in hash_long() is reserved for the final
block-stripe finalize step that will land with full SecretGen support.
It is currently unused, which is benign until you combine -Werror with a
TU that instantiates schema_hash<T>() (e.g. via PodMessage.hpp's
static_assert on SampleTick). The net pillar (next commit) is the first
TU to hit that combination.
Mark the local [[maybe_unused]] to preserve the documentation intent
without tripping -Werror.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
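A minimal sketch of the attribute in use. `hash_long_sketch`, the 64-byte block size, and the FNV-1a mixing loop are placeholders invented for illustration; the real hash_long() body is not shown in this PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for hash_long(); only the attribute usage matters here.
inline std::uint64_t hash_long_sketch(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    constexpr std::size_t kBlock = 64;  // assumed block size, illustration only

    // Reserved for a later finalize step. The attribute documents that the
    // local is intentionally unused, so -Werror builds do not reject it.
    [[maybe_unused]] const std::uint8_t* last_block =
        (len >= kBlock) ? data + (len - kBlock) : data;

    // Placeholder mixing (FNV-1a), not the project's actual hash.
    std::uint64_t h = 0xcbf29ce484222325ull;
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ull;
    }
    return h;
}
```

Removing the local would also silence the warning, but the attribute keeps the placeholder visible to whoever implements the finalize step.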
New framework pillar exposing the Phyriad pool runtime over UDP via a
custom 16-byte-header protocol (PhyriadNet/1). Six implementation phases
landed in this commit:
Phase 0 — Protocol foundation
framework/net/include/phyriad/net/PN1Frame.hpp
framework/net/include/phyriad/net/PN1Codec.hpp
framework/net/include/phyriad/net/Xxhash32.hpp
framework/net/src/Net.cpp (out-of-line compute_checksum)
framework/net/tests/test_pn1_codec.cpp (520 checks)
Phase 1 — Reliability layer
framework/net/include/phyriad/net/PendingAck.hpp
framework/net/include/phyriad/net/RetransmitQueue.hpp
framework/net/include/phyriad/net/PN1Session.hpp
framework/net/tests/test_pn1_session.cpp (103 checks)
Backoff schedule: 50/100/200/400/800 ms (conservative for Windows
multimedia-timer-clamped runners; LAN deployments can override).
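For illustration, the schedule above can be expressed as a small lookup table. The names `kBackoff`, `retries_exhausted`, and `retransmit_delay` are hypothetical, and the sketch assumes the sender gives up once the fifth delay has been used; the actual RetransmitQueue interface is not shown here.

```cpp
#include <array>
#include <cassert>
#include <chrono>
#include <cstddef>

using std::chrono::milliseconds;

// Schedule taken from the commit message: each delay doubles the previous one.
constexpr std::array<milliseconds, 5> kBackoff{
    milliseconds{50}, milliseconds{100}, milliseconds{200},
    milliseconds{400}, milliseconds{800}};

// True once every scheduled retransmission has been used up.
inline bool retries_exhausted(std::size_t attempt) {
    return attempt >= kBackoff.size();
}

// Delay to wait before retransmission number `attempt` (0-based).
inline milliseconds retransmit_delay(std::size_t attempt) {
    assert(!retries_exhausted(attempt));
    return kBackoff[attempt];
}

// Worst-case time spent waiting before the sender gives up.
inline milliseconds total_backoff() {
    milliseconds sum{0};
    for (const milliseconds d : kBackoff) sum += d;
    return sum;
}
```

Under these assumptions an unacknowledged frame is abandoned after 1550 ms of cumulative waiting, which is the figure a LAN deployment would shrink by overriding the table.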
Phase 2 — I/O backend
framework/net/include/phyriad/net/IoBackend.hpp
framework/net/src/IoBackend_posix.cpp (POSIX sockets + WinSock2)
framework/net/tests/test_io_backend.cpp (43 checks)
Includes multicast support (IP_ADD_MEMBERSHIP / IP_MULTICAST_TTL /
IP_MULTICAST_LOOP) for the Phase 4 pheromone path.
Phase 3 — NetGateway (server side)
framework/net/include/phyriad/net/NetGateway.hpp
Session table is heap-allocated transparently at construction so
declaring a NetGateway on the stack doesn't blow Windows' default
1 MB stack (default config: 64 sessions × 16 frame caches × 4 KiB).
Peer-validates every non-SESSION_INIT frame against recorded src
endpoint to drop spoofed / stale packets cleanly.
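The two design points above can be sketched as follows. With the default configuration the table is 64 × 16 × 4 KiB = 4 MiB, which is why it cannot live on a 1 MB stack. Everything in the `sketch` namespace (type names, `bind`, `accepts`) is hypothetical and only illustrates the idea; it is not the NetGateway API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

namespace sketch {

constexpr std::size_t kSessions    = 64;        // defaults from the commit message
constexpr std::size_t kFrameCaches = 16;
constexpr std::size_t kFrameBytes  = 4 * 1024;

struct PeerAddr {
    std::uint32_t ip;
    std::uint16_t port;
};
inline bool operator==(const PeerAddr& a, const PeerAddr& b) {
    return a.ip == b.ip && a.port == b.port;
}

struct Session {
    bool active = false;
    PeerAddr peer{};
    std::array<std::array<std::uint8_t, kFrameBytes>, kFrameCaches> retx_cache{};
};

using SessionTable = std::array<Session, kSessions>;
static_assert(sizeof(SessionTable) >= 4u * 1024u * 1024u,
              "table is at least 4 MiB, too large for a 1 MB stack");

class Gateway {
public:
    // The table lives on the heap; the object itself stays pointer-sized.
    Gateway() : table_(std::make_unique<SessionTable>()) {}

    // Record the source endpoint seen at session setup.
    void bind(std::size_t slot, PeerAddr peer) {
        assert(slot < kSessions);
        (*table_)[slot].peer = peer;
        (*table_)[slot].active = true;
    }

    // Later frames are accepted only from the recorded endpoint.
    bool accepts(std::size_t slot, PeerAddr src) const {
        if (slot >= kSessions) return false;
        const Session& s = (*table_)[slot];
        return s.active && s.peer == src;
    }

private:
    std::unique_ptr<SessionTable> table_;
};
static_assert(sizeof(Gateway) <= 64, "safe to declare on the stack");

}  // namespace sketch
```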
Phase 4 — NetPheromone (multicast stigmergy sync)
framework/net/include/phyriad/net/NetPheromone.hpp
framework/net/tests/test_net_pheromone.cpp (24 checks)
Latest-wins T-up-to-8-bytes slot array, UDP-multicast-synced.
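A rough sketch of the local half of such a slot array, assuming each value is packed into a 64-bit atomic and the most recent store simply overwrites the previous one. `PheromoneSketch`, `deposit`, and `sniff` are invented names, and the multicast send/receive path is omitted entirely.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

template <typename T, std::size_t N>
class PheromoneSketch {
    static_assert(sizeof(T) <= 8, "value must fit in one 64-bit slot");
    static_assert(std::is_trivially_copyable<T>::value, "value is copied bytewise");

public:
    // Latest-wins: a store replaces whatever the slot held before.
    void deposit(std::size_t slot, T value) {
        assert(slot < N);
        std::uint64_t bits = 0;
        std::memcpy(&bits, &value, sizeof(T));
        // Relaxed is enough here because the slot publishes no other data.
        slots_[slot].store(bits, std::memory_order_relaxed);
    }

    T sniff(std::size_t slot) const {
        assert(slot < N);
        const std::uint64_t bits = slots_[slot].load(std::memory_order_relaxed);
        T value;
        std::memcpy(&value, &bits, sizeof(T));
        return value;
    }

private:
    std::array<std::atomic<std::uint64_t>, N> slots_{};  // zero-initialized
};
```

Because each slot is a single atomic word, a reader sees either the old or the new value and never a torn mix, which is what makes the 8-byte limit useful.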
Phase 5 — Client library (two variants)
framework/net/include/phyriad/net/NetClient.hpp (C++23, full features)
framework/netclient/include/phyriad/netclient/NetClient.hpp (C++17, zero framework deps, single header)
framework/net/tests/test_net_e2e.cpp (33 checks)
framework/netclient/tests/test_netclient_thin.cpp (25 checks)
Phase 6 — Hardening + benchmarks
bench/bench_pn1_codec.cpp — xxHash32 10.4 GB/s, encode/decode 2.27 GB/s
bench/bench_net_rtt.cpp — loopback RTT distribution
bench/bench_net_throughput.cpp — sustained tasks/s
bench/comparisons/bench_pn1_vs_raw_udp.cpp — 98.7% of raw-UDP throughput
bench/comparisons/bench_pn1_vs_tcp_echo.cpp — 1.84x TCP throughput
bench/comparisons/bench_pn1_vs_grpc.cpp — gated on libgrpc++-dev
docs/framework/PERF_BASELINE.json updated with net_metrics_reference
Ergonomic helpers (DX-driven, see docs/internal/NET_PILLAR_DX_COMPARISON.md):
Endpoint::any(port_le)
Endpoint::localhost(port_le)
Endpoint::ipv4(a,b,c,d, port_le)
Endpoint::port() // host-byte-order readback
NetGateway::run(std::stop_token)
NetGateway::run_blocking()
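One way such helpers can hide byte-order handling is sketched below. This assumes `port_le` means "port as the caller writes it" (host order), as the readback comment suggests. `EndpointSketch` is a hypothetical type that stores wire-order bytes explicitly; the real Endpoint layout is not shown in this PR.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct EndpointSketch {
    std::array<std::uint8_t, 4> addr{};       // a.b.c.d
    std::array<std::uint8_t, 2> port_wire{};  // big-endian, as sent on the wire

    static EndpointSketch ipv4(std::uint8_t a, std::uint8_t b, std::uint8_t c,
                               std::uint8_t d, std::uint16_t port_host) {
        EndpointSketch e;
        e.addr = {a, b, c, d};
        // Split the port by value, so the result is the same on any host.
        e.port_wire = {static_cast<std::uint8_t>(port_host >> 8),
                       static_cast<std::uint8_t>(port_host & 0xffu)};
        return e;
    }
    static EndpointSketch any(std::uint16_t port_host) {
        return ipv4(0, 0, 0, 0, port_host);
    }
    static EndpointSketch localhost(std::uint16_t port_host) {
        return ipv4(127, 0, 0, 1, port_host);
    }

    // Readback in the same form the caller passed in.
    std::uint16_t port() const {
        return static_cast<std::uint16_t>((port_wire[0] << 8) | port_wire[1]);
    }
};
```

Callers write `EndpointSketch::localhost(9742)` and never touch `htons`; the conversion happens once, inside the factory.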
WIP gate: root CMakeLists.txt foreach already includes `net` and now
also `netclient` — the gates skip cleanly when the directories aren't
present (verified via portability self-check by cloning to C:/temp).
Documentation:
README.md — new "Optional — Network dispatch" section with
honest benchmark results (codec micro-bench
numbers are CPU-bound + stable; loopback RTT
on Windows is timer-clamped; Δ vs alternatives
is the meaningful comparison)
docs/QUICKSTART_NET.md — 5-minute runnable walkthrough
docs/LLM_INTEGRATION_GUIDE.md — §5.5 NetGateway+NetClient quickstart
for AI agents; net-pillar gotchas in §6
examples/quickstart_net/{server,client}.cpp — 18+9 LOC runnable demo
Total: 77/77 project tests green; 748 new test checks across 6 net-pillar
test binaries; full bench suite compiles + runs on Windows MinGW.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The previous commit landed 22 raw std::memory_order_* uses across the
net pillar + bench + examples. The official scripts/lint_hal.sh enforces
that these be replaced with the hal:: wrappers everywhere outside
framework/hal/. The lint was silently passing on my local box due to
a grep -P locale bug (Git-Bash without UTF-8 locale returns exit 2,
suppressed by `2>/dev/null`); on Linux CI it would have failed loud.
Categorised fixes:
• Thread stop flags in tests, benches, and examples → ctrl_*_acquire /
ctrl_*_release (signal semantics — driver-thread shutdown handshake).
• NetPheromone slot store/load → stat_*_relaxed (slot values do NOT
synchronise with any sibling payload — same intentional relaxed
semantics as the in-process Pheromone<T,N>).
• IoBackend_posix WSAStartup refcount → kept raw acq_rel with the
documented `// HAL: acq_rel refcount …` justification on the same
line (per the documented escape in MemoryOrder.hpp; the refcount
needs acq_rel to publish WSAStartup completion to peers and to
observe all prior socket use before WSACleanup).
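To make the categories concrete, here is a sketch of what thin wrappers of this kind can look like. The names mirror the `ctrl_*` / `stat_*` families mentioned above, but the signatures and the `hal_sketch` namespace are assumptions; the real definitions live in MemoryOrder.hpp and are not reproduced in this PR.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace hal_sketch {

// Control signals: the store publishes, the load observes.
template <typename T>
inline void ctrl_store_release(std::atomic<T>& a, T v) noexcept {
    a.store(v, std::memory_order_release);
}
template <typename T>
inline T ctrl_load_acquire(const std::atomic<T>& a) noexcept {
    return a.load(std::memory_order_acquire);
}

// Statistics and standalone values: no ordering with other memory is implied.
template <typename T>
inline void stat_store_relaxed(std::atomic<T>& a, T v) noexcept {
    a.store(v, std::memory_order_relaxed);
}
template <typename T>
inline T stat_load_relaxed(const std::atomic<T>& a) noexcept {
    return a.load(std::memory_order_relaxed);
}

}  // namespace hal_sketch

// A driver-thread stop flag written against the wrappers.
class StopFlag {
public:
    void request_stop() noexcept { hal_sketch::ctrl_store_release(stop_, true); }
    bool stop_requested() const noexcept { return hal_sketch::ctrl_load_acquire(stop_); }

private:
    std::atomic<bool> stop_{false};
};
```

Each wrapper is a one-line inline forwarder, so the call site names the intent while the generated code is the same atomic operation as before.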
Verification:
LC_ALL=C.UTF-8 bash scripts/lint_hal.sh
→ 432 files inspected, 0 violations
77/77 project tests still green after the wrapper rewrite — the HAL
wrappers are zero-overhead inline calls so behaviour is unchanged.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… maybe-uninit
Three failures detected on PR #8 CI matrix; all are warning-as-error
escalations that only fire under the specific toolchains they fail on:
• TSan (gcc 13, -fsanitize=thread): test_io_backend.cpp:202-203.
`pn::PN1Header hdr; const uint8_t* pp;` are uninitialized when the
lambda body is inlined into the std::function invoker. gcc -O1+ plus
TSan's interceptor instrumentation makes the no-throw decode() path
that DOES initialize them invisible to the flow-sensitive
maybe-uninit pass. Fixed with `{}` and `= nullptr` explicit
initializers (zero-cost, since decode() overwrites them).
• clang-18 + libc++ (-Werror -Wunused-result):
test_io_backend.cpp:144,175,186. IoBackend::open() is [[nodiscard]]
by design (callers should branch on the status), but two test setup
sites called it for side-effects only. Fixed with `(void)` casts.
• MSVC /WX (C4127 conditional expression is constant):
test_pn1_codec.cpp + every other net test. The standard
`do { … } while (false)` test-macro idiom trips MSVC's strict
constant-conditional warning. Standard portable fix is the
GoogleTest-style `while ((void)0,0)`: the comma-expression with a
void cast suppresses constant-folding while preserving the do-once
semantics that statement-expansion macros need.
Local verification:
net pillar: 6/6 tests pass
project: 77/77 tests pass
These are the precise three checks that were red on PR #8; expecting
all four checks (incl. lint-hal, which already turned green on the
previous commit) green after this push.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
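The macro idiom from the MSVC item can be sketched like this. `SKETCH_EXPECT_EQ` and `sketch_failures()` are hypothetical names; the project's real test macros are not shown in this PR.

```cpp
#include <cassert>
#include <cstdio>

// Failure counter kept in a function-local static so the header stays self-contained.
inline int& sketch_failures() {
    static int count = 0;
    return count;
}

// do { ... } while (...) makes the macro expand to exactly one statement, so it
// composes with if/else. The `(void)0, 0` condition still evaluates to false but
// is not a bare constant, which is the form the commit uses to avoid C4127.
#define SKETCH_EXPECT_EQ(a, b)                                        \
    do {                                                              \
        const auto& _aa = (a);                                        \
        const auto& _bb = (b);                                        \
        if (!(_aa == _bb)) {                                          \
            ++sketch_failures();                                      \
            std::fprintf(stderr, "%s:%d: %s != %s\n",                 \
                         __FILE__, __LINE__, #a, #b);                 \
        }                                                             \
    } while ((void)0, 0)
```

Usage is the same as with the `while (false)` form: `if (cond) SKETCH_EXPECT_EQ(x, y); else ...` parses as intended because the expansion ends at the caller's semicolon.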
Round 2 of the CI portability fixes (round 1 was 5190f8f). Two checks
still failed because the earlier fixes covered only the obvious sites:
• TSan (gcc 13): same maybe-uninitialized pattern as before but in
test_pn1_codec.cpp instead of test_io_backend.cpp. Six sites of
`pn::PN1Header hdr; const uint8_t* pp; size_t fsize;` declared
without explicit init. Value-initialize all of them so the
flow-sensitive pass sees a definite assignment before the lambda
invocation, no matter what gcc -O1 + TSan instrumentation does to
the std::function dispatch chain.
• MSVC /WX (C4127 conditional expression is constant): the
`while ((void)0,0)` fix from 5190f8f handled the while-side of the
macro, but MSVC also fires C4127 on the `if (_aa == _bb)` inside
EXPECT_EQ when BOTH operands are compile-time constants
(`EXPECT_EQ(sizeof(pn::PN1Header), 16u)` and friends). The standard
portable fix is a file-level `#pragma warning(disable: 4127)`;
GoogleTest and Catch2 do the exact same thing for the same reason.
Applied to all six net+netclient test files.
Local verification: 6/6 net pillar tests pass post-fix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CI on PR #8 surfaced two issues round-2 didn't catch:
1. TSan data race (gcc 13 -fsanitize=thread): ServerSession::active was
a plain bool, written by the driver thread (handle_session_close /
handle_session_init / abort in tick()) and read by the user thread
(active_session_count() called from main in the integration tests).
This is a real race, not a false positive: the test_disconnect path
explicitly reads the flag from main right after the driver finishes
processing SESSION_CLOSE.
Fixed by making `active` a std::atomic<bool> with ctrl_store_release /
ctrl_load_acquire wrappers on every site (9 read/write spots in
NetGateway.hpp). The store-release on `active = true` also publishes
the freshly-initialized peer / sess / retx_cache fields to any future
reader that observes active via ctrl_load_acquire, which fixes a
second latent visibility race in allocate_session().
2. MSVC C4267 (size_t -> uint32_t in test_pn1_codec.cpp:373): the
misaligned-input test loop used `offset` (size_t) directly as the
session_id parameter of encode(). Explicit static_cast<uint32_t>.
Local verification: 6/6 net pillar tests pass after the change.
Single-threaded local Windows can't reproduce the TSan race; the fix is
structural and matches the existing HAL discipline for
publisher/subscriber atomic flags (same pattern phyriad_pool uses for
worker.active state).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
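The publication pattern described in item 1 can be sketched as below. `SessionSketch` and `TableSketch` are hypothetical, the table size of 8 is arbitrary, and raw `std::memory_order` arguments stand in for the HAL wrappers to keep the example self-contained.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

struct SessionSketch {
    std::uint32_t peer_ip = 0;    // plain fields, written before `active` is set
    std::uint16_t peer_port = 0;
    std::atomic<bool> active{false};
};

class TableSketch {
public:
    // Driver thread: fill in the fields first, then publish with a release store.
    bool allocate(std::uint32_t ip, std::uint16_t port) {
        for (SessionSketch& s : sessions_) {
            if (!s.active.load(std::memory_order_acquire)) {
                s.peer_ip = ip;
                s.peer_port = port;
                s.active.store(true, std::memory_order_release);
                return true;
            }
        }
        return false;  // table full
    }

    // Driver thread: retire a slot.
    void close(std::size_t index) {
        assert(index < sessions_.size());
        sessions_[index].active.store(false, std::memory_order_release);
    }

    // User thread: an acquire load that observes `true` also sees the fields
    // written before the matching release store.
    std::size_t active_session_count() const {
        std::size_t n = 0;
        for (const SessionSketch& s : sessions_) {
            if (s.active.load(std::memory_order_acquire)) ++n;
        }
        return n;
    }

private:
    std::array<SessionSketch, 8> sessions_;
};
```

The ordering matters only in one direction: the flag is the last thing written when a session is created, so any reader that sees it set can safely read the rest of the slot.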
bench_pn1_codec.cpp:34 used the GCC/Clang idiom
asm volatile("" :: "r"(x) : "memory")
inside an `[[gnu::noinline]]` template. MSVC has neither feature, so
the windows-msvc-Release CI job failed with C2760 (unexpected
'volatile') + C3878.
Switched to the standard portable trick: write the value to a local
`volatile T sink`. The compiler cannot prove the write is dead, so it
must materialize the value (same observable effect as the asm clobber,
just without the asm). Verified bench results unchanged:
xxhash32(4 KiB) 10.37 GB/s (was 10.41 GB/s)
encode(4 KiB) full path 2.27 GB/s (same)
decode(4 KiB) full path 2.30 GB/s (same)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
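A minimal sketch of the volatile-sink trick, assuming a helper named `keep_alive` (hypothetical; the name used in bench_pn1_codec.cpp is not shown) and a toy checksum loop standing in for the benchmarked code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable replacement for an empty asm clobber: the write to a volatile
// object is an observable side effect, so the value must be computed.
template <typename T>
inline void keep_alive(const T& value) {
    [[maybe_unused]] volatile T sink = value;
}

// Toy workload: without keep_alive, a benchmark that ignores the return
// value could have this loop removed by the optimizer.
inline std::uint32_t checksum_kept(const std::uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::uint32_t acc = 0;
    for (std::size_t i = 0; i < len; ++i) {
        acc = acc * 31u + data[i];
    }
    keep_alive(acc);
    return acc;
}
```

The sink adds one store per call, which is consistent with the essentially unchanged figures reported above.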
Summary
New framework pillar (`phyriad_net`) exposing the Phyriad pool runtime over UDP via a custom 16-byte-header protocol (PhyriadNet/1). Plus a thin C++17 client (`phyriad_netclient`) for resource-constrained edges.
Six implementation phases landed in one PR:
• `PN1Frame`, `PN1Codec`, runtime `xxHash32` (520 checks)
• `RetransmitQueue` + `PN1Session` state machine + sliding-window dedup (103 checks)
• `NetGateway`: session table, peer validation, anti-spoofing
• `NetPheromone<T,N>`: latest-wins stigmergy sync over UDP multicast (24 checks)
• `NetClient` (33 checks) + thin C++17 single-header in `framework/netclient/` (25 checks)
Total: 77/77 project tests green; 748 new test checks across 6 net-pillar test binaries.
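For readers unfamiliar with sliding-window dedup, here is a generic sketch of the technique. It assumes a 64-entry window tracked as a bitmap and ignores sequence-number wraparound; `DedupWindowSketch` is a hypothetical name and the window size used by PN1Session is not stated in this PR.

```cpp
#include <cassert>
#include <cstdint>

class DedupWindowSketch {
public:
    // Returns true if `seq` has not been seen (and records it);
    // returns false for duplicates and for frames older than the window.
    bool accept(std::uint32_t seq) {
        if (!started_) {
            started_ = true;
            highest_ = seq;
            bitmap_ = 1;  // bit 0 always represents `highest_`
            return true;
        }
        if (seq > highest_) {
            // Slide the window forward; anything shifted out is forgotten.
            const std::uint32_t shift = seq - highest_;
            bitmap_ = (shift >= 64) ? 0 : (bitmap_ << shift);
            bitmap_ |= 1;
            highest_ = seq;
            return true;
        }
        const std::uint32_t age = highest_ - seq;
        if (age >= 64) return false;  // too old to track
        const std::uint64_t bit = std::uint64_t{1} << age;
        if (bitmap_ & bit) return false;  // duplicate (e.g. a retransmission)
        bitmap_ |= bit;
        return true;
    }

private:
    bool started_ = false;
    std::uint32_t highest_ = 0;
    std::uint64_t bitmap_ = 0;
};
```

A window like this lets a receiver accept reordered frames while discarding retransmitted copies, at a cost of 16 bytes of state per session.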
Honest benchmark results
Codec micro-bench (gcc 15.2 + Release+LTO, 7950X3D; reproducible, CPU-bound):
• `PN1Codec::encode` (4 KiB frame)
• `PN1Codec::decode` (4 KiB + checksum verify)
Head-to-head vs alternatives (32-byte payload, 5 000 iters, loopback):
Windows numbers are clamped by the 15.6 ms multimedia timer — both raw UDP and PN1 hit the same clamp. The meaningful comparison is Δ vs raw UDP at p50 ≈ +200 ns (codec + framing + checksum overhead) and 98.7 % of raw UDP throughput. Linux CI numbers will be uniformly tighter.
DX
• examples/quickstart_net/)
• `Endpoint::any(9742)` / `Endpoint::localhost(9742)` / `Endpoint::ipv4(192,168,1,5, 9742)`: no `htons` boilerplate
• `gw.run(std::stop_token)`: drive the gateway with one line
• `docs/internal/NET_PILLAR_DX_COMPARISON.md` (internal)
Documentation
• `README.md`: new "Optional — Network dispatch" section with honest benchmarks
• `docs/QUICKSTART_NET.md`: 5-minute runnable walkthrough
• `docs/LLM_INTEGRATION_GUIDE.md` §5.5: AI-agent-friendly quickstart + gotchas
Test plan
• `ctest --exclude bench` from `build/`)
• `C:/temp` and build from scratch passes
• `net` and `netclient`
🤖 Generated with Claude Code