v5.10.3
T2 daemon reliability fix.
Fixed
- T2 daemon no longer pegged to ~100% CPU (and T2 writes no longer fail
with "database is locked") by a client reclaim_stale RPC flood
(nexus-xmohw). Aspect-queue stale-row reclaim is daemon-owned since
nexus-we61e — the daemon's own loop calls it directly once per interval.
But version-skewed workers from before we61e (conexus <=5.10.0) still RPC
aspect_queue.reclaim_stale on every poll, and the daemon honoured each as
a real full-table UPDATE+commit. With several stale workers that floods the
SQLite write lock, pegs a core, makes the daemon slow to answer hello(),
and triggers ensure-running to spawn a replacement — a takeover churn that
leaves two daemons contending on memory.db, at which point nx memory put
fails outright. The client-facing dispatch entry for
aspect_queue.reclaim_stale is now a cheap no-op returning 0 that never
touches the DB; the daemon's own reclaim loop is unaffected. A one-shot
WARNING per daemon surfaces that stale workers are still present so an
operator can restart them.
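
A minimal sketch of the pattern described above, not the actual daemon code: the client-facing dispatch entry returns 0 without opening the database and logs a single WARNING per process. The names ReclaimStaleShim, build_dispatch, and the logger name are hypothetical and chosen only for illustration; only the RPC name aspect_queue.reclaim_stale comes from this entry.

```python
import logging

# Hypothetical logger name, used only in this sketch.
logger = logging.getLogger("t2_daemon_sketch")


class ReclaimStaleShim:
    """Hypothetical client-facing handler for aspect_queue.reclaim_stale.

    It never touches the database. The daemon's own reclaim loop is
    assumed to call the real reclaim function directly, bypassing this
    dispatch entry.
    """

    def __init__(self) -> None:
        # One flag per handler instance, i.e. one warning per daemon process.
        self._warned = False

    def __call__(self, *args, **kwargs) -> int:
        if not self._warned:
            self._warned = True
            logger.warning(
                "aspect_queue.reclaim_stale called over RPC; "
                "stale workers (conexus <=5.10.0) are still running, "
                "restart them"
            )
        # Report zero reclaimed rows so old clients see a valid reply.
        return 0


def build_dispatch() -> dict:
    """Build a hypothetical RPC dispatch table keyed by method name."""
    return {"aspect_queue.reclaim_stale": ReclaimStaleShim()}
```

In this sketch, repeated calls from a stale worker cost a dictionary lookup and an integer return, so they can no longer contend for the SQLite write lock.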