Net v0.27.6 - "Purple Rain"
A bindings & integration bug-fix release. A full-workspace bug hunt across the net crate (~100k LOC Rust) plus the Go / Python / FFI binding layers landed 34 of 37 findings across 47 commits and three review rounds. Every concrete first-pass bug lived at the language-binding / FFI edge, including three use-after-free races in the shipped Go module that are reachable by ordinary context cancellation.
No wire-format change, no C-ABI change, no public API change. Drop-in against honest v0.27.5 and earlier peers. Full audit: docs/misc/BUG_AUDIT_2026_06_18_BINDINGS.md.
Important
Go module users should upgrade promptly. The three use-after-free races are reachable on routine ctx cancellation of a streaming call, not an exotic double-close.
Highlights
- Fixed 3 use-after-free races in the shipped Go module (`RpcStream`, `MeshOsDaemonHandle`, `MeshBlobAdapter`).
- Added FFI panic guards to `rpc-ffi` and `compute-ffi`: no more unwinding across the C ABI (UB).
- Fixed a reliable-stream sequence gap under scheduler backpressure (permanent gap + duplicate re-send on the documented bulk-load path).
- ~25 additional MEDIUM/LOW correctness fixes across meshdb joins, the load balancer, the aggregator, deck streams, nRPC, and RedEX.
- Rust changes are `cargo check` + `clippy` clean with per-finding regression tests; the core, identity/security, and RedEX recovery passes came back clean and verified.
High-severity fixes
- Go use-after-free ×3. `RpcStream`, `MeshOsDaemonHandle`, and `MeshBlobAdapter` guarded their native handle with check-then-use (a bare `atomic.Bool`, or a mutex dropped before the cgo call) instead of claim-then-use. A concurrent free (ctx-cancel watcher, `Free()`/`Close()`, or the GC finalizer) could `drop(Box::from_raw(...))` the native object while a `Recv`/`Send`/`Store`/`NextControl` was parked in `block_on`, causing memory corruption. Fixed with a refcount quiesce guard (`streamHandleGuard`): ops bracket the cgo call with `enter()`/`leave()` without holding a lock, and the free runs once after the last op leaves and never blocks. Review rounds also covered the post-`Split()` halves and a separate `PublishLog` GC `KeepAlive` bug.
- FFI panic guards (`rpc-ffi`, `compute-ffi`). Neither crate had a `catch_unwind` at any entry point and both called tokio's raw `Runtime::block_on`, which panics on runtime re-entry; the unwind crossing `extern "C"` is undefined behavior. All entry points are now wrapped in `ffi_guard!` and `block_on` routes through the abort-on-reentry wrapper. (A review round caught the macro being defined but never invoked in `compute-ffi`, leaving all 80 entry points unguarded, plus a `daemon_count` panic-default that collided with a valid result.)
- Reliable-stream sequence gap under backpressure. `send_on_stream` consumed a sequence number atomically with byte credit, but a full `FairScheduler` queue surfaced `Backpressure` after the seq was taken: credit was refunded, the seq was not, and the retransmit descriptor was never registered. Result on a reliable stream: a permanent gap the receiver NACKs forever, plus duplicate re-send of already-committed batches on retry. Fixed by making the seq refundable and not replaying committed events; a follow-up bounded the committed-prefix retry (`COMMITTED_FLUSH_STALL_BUDGET`, 30 s) so a stalled receiver can't spin the sender forever.
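The claim-then-use discipline behind the Go fix can be illustrated with a small pure-Go sketch. This is not the shipped `streamHandleGuard`: the type `handleGuard`, its `close()` method, the `release` callback, and the bit layout of `state` are all assumptions made for illustration. It only shows the shape the notes describe, where an op claims the handle before touching it, no lock is held across the call, and the free runs exactly once after the last op leaves.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// closedBit marks the guard as closing; the low bits count in-flight ops.
const closedBit = int64(1) << 62

// handleGuard is a hypothetical refcount quiesce guard (illustration only).
type handleGuard struct {
	state   atomic.Int64 // closedBit | number of ops between enter and leave
	release func()       // frees the native handle; must run exactly once
}

// enter claims the handle for one op. It fails once close has been called,
// so no op can start using a handle that is about to be freed.
func (g *handleGuard) enter() bool {
	for {
		s := g.state.Load()
		if s&closedBit != 0 {
			return false
		}
		if g.state.CompareAndSwap(s, s+1) {
			return true
		}
	}
}

// leave ends one op. If the guard is closing and this was the last op,
// the free runs here, on the leaving goroutine.
func (g *handleGuard) leave() {
	if g.state.Add(-1) == closedBit {
		g.release()
	}
}

// close requests the free without blocking. It frees immediately only when
// no op is in flight; otherwise the last leave performs the free.
func (g *handleGuard) close() {
	for {
		s := g.state.Load()
		if s&closedBit != 0 {
			return // already closing: idempotent
		}
		if g.state.CompareAndSwap(s, s|closedBit) {
			if s == 0 {
				g.release()
			}
			return
		}
	}
}

func main() {
	frees := 0
	g := &handleGuard{release: func() { frees++ }}

	ok := g.enter()        // an op claims the handle (think: before the cgo call)
	g.close()              // a concurrent Close() must not free yet
	fmt.Println(ok, frees) // true 0
	fmt.Println(g.enter()) // false: no new ops after close
	g.leave()              // last op leaves; the free runs now
	g.close()              // a second close is a no-op
	fmt.Println(frees)     // 1
}
```

In real binding code the cgo call would sit between `enter()` and a deferred `leave()`, and the ctx-cancel watcher, `Close()`, and the finalizer would all funnel into the same `close()` path.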
Other fixes (MEDIUM / LOW)
- meshdb executor: LEFT/RIGHT OUTER join (and `sort_merge_join`) silently dropped preserved-side rows with a missing/non-scalar join key; now emitted unmatched, matching full-outer. Federated query reported a fully-delivered result as failed on a lost trailing `End` frame; sender now always emits a `final = true` terminal batch.
- load balancer: half-open circuit probe slot could be permanently claimed (and a `0` recovery window collapsed the breaker; now clamped to ≥ 1 ms); hash-ring re-add leaked/clobbered ~150 stale vnodes; weighted-RR starved endpoints when all weights were < 1.0.
- aggregator: a zero `summary_interval` panicked the spawned task; `filter_novel` deduped on `fold_kind` only and re-published multi-row summaries every tick.
- meshos reconcile / ICE: duplicate `RequestEviction` per tick; `MarkAvoid` re-emitted every tick; `ThawCluster` blocked by the cluster cooldown (break-glass violation).
- deck streams: `deck-ffi` reported genuine stream-end as a timeout (livelocking Go polling loops); `AuditStream`/`LogStream`/`FailureStream` could park forever by not re-arming the waker; exported timestamps printed an epoch hour-count (missing `% 24`).
- nRPC / routing: duplicate in-flight `call_id` overwrote the prior caller's response sender; `mint_random_call_id` returned `0` on `getrandom` failure; a route owner couldn't update its own route to a worse metric.
- RedEX: `OutstandingRequests` cap only evicted expired entries (unbounded under load), now re-backed with `lru::LruCache` for an O(1) hard bound; per-entry checksum header coverage hardened (corrupt `seq` now caught by a monotonicity walk); age-based retention no longer assumes a monotonic wall-clock; catch-up TOCTOU + a 32-bit overflow guard.
- cortex FFI: five `(out_json, out_len)` functions now honor the out-param pre-zero contract; `net_rpc_duplex_into_split` no longer drops the surviving half on partial-consume. Plus out-param null checks and `len > isize::MAX` guards across the Go `*-ffi` crates.
Investigated / deferred (not shipped)
- Anti-replay `MAX_FORWARD` (reported HIGH): downgraded to INFO and reverted. Not an exploitable replay bypass; the forward-jump tolerance is deliberate design (survives > 1024-packet loss without a forced re-handshake, and stale counters are still caught by the age check). The proposed hardening broke 4 replay-window tests with no security gain.
- Deferred: `publish_to_peer` event-count chunking (a `publish_many` > 2028 events still trips a release-mode `assert!`); `C.GoBytes` ≥ 2 GiB truncation (~20 sites); the FFI seed-pointer length check (its companion guards landed). Each needs a non-trivial or breaking-ABI change unsuitable for a bugfix release.
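Until the deferred chunking fix lands, callers that batch large publishes may want to split them on their own side. The sketch below is one possible caller-side approach, not part of this release: `chunkEvents` and `maxEventsPerPublish` are hypothetical names, the 2028 limit is read off the note above, and whether splitting one batch into several calls is acceptable (ordering, atomicity) depends on the application.

```go
package main

import "fmt"

// maxEventsPerPublish is an assumption taken from the release note
// ("a publish_many > 2028 events still trips a release-mode assert!").
const maxEventsPerPublish = 2028

// chunkEvents splits events into consecutive slices of at most max elements.
// The chunks alias the input slice; no event data is copied.
func chunkEvents[T any](events []T, max int) [][]T {
	if max <= 0 {
		return nil
	}
	chunks := make([][]T, 0, (len(events)+max-1)/max)
	for len(events) > max {
		// The three-index slice caps capacity so an append on a chunk
		// cannot overwrite the events that follow it.
		chunks = append(chunks, events[:max:max])
		events = events[max:]
	}
	if len(events) > 0 {
		chunks = append(chunks, events)
	}
	return chunks
}

func main() {
	events := make([][]byte, 5000)
	for i, chunk := range chunkEvents(events, maxEventsPerPublish) {
		// A real caller would hand each chunk to its publish call here.
		fmt.Printf("chunk %d: %d events\n", i, len(chunk))
	}
	// prints:
	// chunk 0: 2028 events
	// chunk 1: 2028 events
	// chunk 2: 944 events
}
```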
Dependencies
All in net/crates/net/Cargo.lock; there is no Cargo.toml change, so crates.io library consumers resolve identically; these reach only the distributed artifacts (CLI, FFI staticlibs, npm prebuilds, Python wheels, deck):
- Transitive bumps: `redis` 1.2.3, `syn` 2.0.118, `napi` 3.9.3, `bytes` 1.12.0, `h2` 0.4.15, `time` 0.3.49, `getrandom` 0.4.3, `webpki-roots` 1.0.8.
- Footprint reduction: a transitive WASM component-model toolchain (`wit-bindgen-*`, `wit-component`, `wit-parser`, `wasm-encoder`/`-metadata`/`-parser`, `wasip3`, `leb128fmt`, `id-arena`, `prettyplease`, `unicode-xid`) dropped out; `foldhash`/`hashbrown` shed a duplicate major. Nothing reaches the datapath, crypto, or wire.
Upgrade notes
- Breaking changes: none on the wire, in the C ABI, or in the public Go/Python API. The Go fixes are internal lock-discipline changes behind unchanged signatures.
- One behavioural fix to note: `deck-ffi` stream functions now correctly return `END_OF_STREAM` on a closed stream for non-zero timeouts instead of a silent `OK`/`NULL`. A Go loop that previously spun on `(nil, nil)` will now terminate as documented.
- Verification caveat: this build environment has no cgo toolchain, so the Go module fixes were validated by `gofmt` + manual review + a pure-Go guard test, not a cgo compile/link. A cgo build on a release runner is the recommended gate before publishing the Go module tag.
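For the `deck-ffi` behavioural fix, a Go polling loop written against the documented contract might look like the sketch below. The names are hypothetical stand-ins rather than the module's actual API: `errEndOfStream` plays the role of `END_OF_STREAM`, `eventSource` and `fakeStream` exist only so the example runs on its own, and a `(nil, nil)` return is assumed to mean "timed out, nothing yet".

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errEndOfStream is a hypothetical stand-in for the END_OF_STREAM status.
var errEndOfStream = errors.New("end of stream")

// eventSource is an assumed shape for a polled stream binding.
type eventSource interface {
	Next(timeout time.Duration) ([]byte, error)
}

// fakeStream yields its queued items, then reports end-of-stream.
// A nil item simulates a poll that timed out with no data.
type fakeStream struct{ items [][]byte }

func (s *fakeStream) Next(time.Duration) ([]byte, error) {
	if len(s.items) == 0 {
		return nil, errEndOfStream
	}
	item := s.items[0]
	s.items = s.items[1:]
	return item, nil
}

// drain polls until the stream ends and returns how many items it saw.
// (nil, nil) is treated as a timeout and polled again, so the loop only
// terminates if a closed stream reports end-of-stream explicitly.
func drain(src eventSource, timeout time.Duration) (int, error) {
	n := 0
	for {
		item, err := src.Next(timeout)
		switch {
		case errors.Is(err, errEndOfStream):
			return n, nil
		case err != nil:
			return n, err
		case item == nil:
			continue // timeout: poll again
		}
		n++
	}
}

func main() {
	src := &fakeStream{items: [][]byte{[]byte("a"), nil, []byte("b")}}
	n, err := drain(src, 50*time.Millisecond)
	fmt.Println(n, err) // 2 <nil>
}
```

If a closed stream only ever produced `(nil, nil)`, the `continue` branch would spin forever, which is the livelock the notes describe.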