refactor: break circular dependency over net_processing and dkgsessionhandler#7314
refactor: break circular dependency over net_processing and dkgsessionhandler#7314knst wants to merge 12 commits into
Conversation
|
Review GateCommit:
|
WalkthroughThis PR separates LLMQ DKG networking and thread management into a new llmq::NetDKG handler (and NetDKGStub), changes CDKGSession/ActiveDKGSession phase hooks to return std::optional payloads instead of enqueueing, refactors pending-message APIs to accept pre-serialized payloads, updates CDKGSessionManager/ActiveDKGSessionHandler interfaces, simplifies Active/Observer context constructors and init wiring, and decouples PeerManager from ActiveContext by passing a CActiveMasternodeManager* nodeman. It also adds debug helpers and a spork check for enabling DKG. Sequence Diagram(s)sequenceDiagram
participant Node as DKG Node
participant NetDKG as NetDKG Handler
participant SessionMgr as CDKGSessionManager
participant SessionHdlr as CDKGSessionHandler
participant ActiveDKG as ActiveDKGSession
Node->>NetDKG: ProcessMessage(MSG_QUORUM_CONTRIB)
NetDKG->>SessionMgr: route message (llmqType, quorumIndex)
SessionMgr->>SessionHdlr: enqueue(serialized, hash)
Note over SessionHdlr: Batches message in pending queues
NetDKG->>NetDKG: HandleDKGRound()
loop per_phase
NetDKG->>SessionHdlr: ProcessPendingMessageBatch()
SessionHdlr->>ActiveDKG: Contribute()/VerifyAndCommit()
ActiveDKG-->>SessionHdlr: std::optional<Message>
SessionHdlr->>NetDKG: RelayInvToParticipants()
end
sequenceDiagram
participant Init as Initialization
participant ActiveCtx as ActiveContext
participant PeerMgr as PeerManager
participant NetDKG as NetDKG Handler
participant Spork as CSporkManager
Init->>ActiveCtx: construct(dmnman, qman, qsnapman, sigman)
Init->>PeerMgr: make(nodeman=active_ctx ? active_ctx->nodeman.get() : nullptr)
Note over PeerMgr: m_nodeman set
Init->>NetDKG: construct(sporkman, dkgsman, qman)
Init->>Spork: IsQuorumDKGEnabled()
Note over NetDKG: Check spork for DKG enabled
Init->>ActiveCtx: Start()
Note over ActiveCtx: Start() takes no connman/peerman args
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (3)
src/active/dkgsession.cpp (1)
106-111: ⚡ Quick winMove the
sent*debug updates to the actual enqueue/broadcast path.These methods now only build and return a message. Setting
sentContributions,sentComplaint,sentJustification, andsentPrematureCommitmenthere records a successful send beforeNetDKGhas actually serialized and queued/broadcast the payload.Also applies to: 292-297, 382-387, 539-544
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/active/dkgsession.cpp` around lines 106 - 111, The dkgDebugManager.UpdateLocalSessionStatus calls inside the message-builder functions (e.g., setting CDKGDebugSessionStatus::statusBits.sentContributions, sentComplaint, sentJustification, sentPrematureCommitment) must be removed from those builders (the functions that build and return qc/messages) and moved into the actual send path inside NetDKG — i.e., the code that serializes and enqueues/broadcasts the payload. Locate the UpdateLocalSessionStatus calls in the builders and delete them there, then add equivalent UpdateLocalSessionStatus updates immediately after NetDKG performs the serialization/queuing/broadcast so the debug flags reflect a real successful send.src/llmq/debug.cpp (1)
213-228: 💤 Low valueOptional: make
MarkAbortedidempotent w.r.t.nTime.
MarkAborted's lambda always returnstrue, so each call bumpslocalStatus.nTimeeven when the session was already marked aborted.MarkPhaseAdvancedalready does the right thing (returnschanged). For consistency and to avoid spurious timestamp updates if the helper is invoked more than once on the same aborted session, consider returning a real changed flag.♻️ Proposed change
void CDKGDebugManager::MarkAborted(Consensus::LLMQType llmqType, int quorumIndex) { UpdateLocalSessionStatus(llmqType, quorumIndex, [&](CDKGDebugSessionStatus& status) { + if (status.statusBits.aborted) return false; status.statusBits.aborted = true; return true; }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/debug.cpp` around lines 213 - 228, MarkAborted currently always returns true from its UpdateLocalSessionStatus lambda which forces localStatus.nTime to update every call; change the lambda in CDKGDebugManager::MarkAborted to compute a changed flag by comparing status.statusBits.aborted with the new value, set status.statusBits.aborted = true, and return that changed flag (i.e., return status.statusBits.aborted was previously false). This makes MarkAborted idempotent like MarkPhaseAdvanced and avoids spurious nTime updates.src/llmq/net_dkg.cpp (1)
449-482: 💤 Low valueInconsistent
dynamic_castusage betweenStart()andInterrupt(); consider tightening shutdown.
Start()uses the throwing reference form (dynamic_cast<ActiveDKGSessionHandler&>) whileInterrupt()uses the safe pointer form. Both iterate the same handler set and both early-return onm_active == nullptr, so the invariant is identical and the two should agree.The reference form also has a small resilience gap: if the cast were ever to throw mid-iteration, the threads already pushed into
m_phase_threadswould never be joined, because~NetDKG()only callsDisconnectManagers()(line 254), notStop(). Either use the pointer form here as well, or have the destructor callStop()defensively so a partially-initialized state still cleans up.♻️ Proposed alignment with `Interrupt()`
m_qdkgsman.ForEachHandler([this](CDKGSessionHandler& base) { - auto& handler = dynamic_cast<ActiveDKGSessionHandler&>(base); - std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler.params.type), handler.QuorumIndex()); - m_phase_threads.emplace_back([this, name = std::move(thread_name), &handler] { - util::TraceThread(name.c_str(), [this, &handler] { PhaseHandlerThread(handler); }); - }); + auto* handler = dynamic_cast<ActiveDKGSessionHandler*>(&base); + if (!Assume(handler != nullptr)) return; + std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler->params.type), handler->QuorumIndex()); + m_phase_threads.emplace_back([this, name = std::move(thread_name), handler] { + util::TraceThread(name.c_str(), [this, handler] { PhaseHandlerThread(*handler); }); + }); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 449 - 482, Start() uses dynamic_cast<ActiveDKGSessionHandler&> which can throw partway through filling m_phase_threads and leave threads unjoined; make Start() mirror Interrupt() by using the non-throwing pointer form (dynamic_cast<ActiveDKGSessionHandler*>) when iterating m_qdkgsman.ForEachHandler so you only create threads for valid handlers and avoid exceptions during the loop, ensuring m_phase_threads remains consistent for later Stop() join; update the lambda in NetDKG::Start to check the pointer, capture it safely, and call PhaseHandlerThread(handler) with the pointer/ref as appropriate.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@src/active/dkgsession.cpp`:
- Around line 106-111: The dkgDebugManager.UpdateLocalSessionStatus calls inside
the message-builder functions (e.g., setting
CDKGDebugSessionStatus::statusBits.sentContributions, sentComplaint,
sentJustification, sentPrematureCommitment) must be removed from those builders
(the functions that build and return qc/messages) and moved into the actual send
path inside NetDKG — i.e., the code that serializes and enqueues/broadcasts the
payload. Locate the UpdateLocalSessionStatus calls in the builders and delete
them there, then add equivalent UpdateLocalSessionStatus updates immediately
after NetDKG performs the serialization/queuing/broadcast so the debug flags
reflect a real successful send.
In `@src/llmq/debug.cpp`:
- Around line 213-228: MarkAborted currently always returns true from its
UpdateLocalSessionStatus lambda which forces localStatus.nTime to update every
call; change the lambda in CDKGDebugManager::MarkAborted to compute a changed
flag by comparing status.statusBits.aborted with the new value, set
status.statusBits.aborted = true, and return that changed flag (i.e., return
status.statusBits.aborted was previously false). This makes MarkAborted
idempotent like MarkPhaseAdvanced and avoids spurious nTime updates.
In `@src/llmq/net_dkg.cpp`:
- Around line 449-482: Start() uses dynamic_cast<ActiveDKGSessionHandler&> which
can throw partway through filling m_phase_threads and leave threads unjoined;
make Start() mirror Interrupt() by using the non-throwing pointer form
(dynamic_cast<ActiveDKGSessionHandler*>) when iterating
m_qdkgsman.ForEachHandler so you only create threads for valid handlers and
avoid exceptions during the loop, ensuring m_phase_threads remains consistent
for later Stop() join; update the lambda in NetDKG::Start to check the pointer,
capture it safely, and call PhaseHandlerThread(handler) with the pointer/ref as
appropriate.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 757eb414-ab77-46e9-b643-a3f32d98e788
📒 Files selected for processing (25)
src/Makefile.amsrc/active/context.cppsrc/active/context.hsrc/active/dkgsession.cppsrc/active/dkgsession.hsrc/active/dkgsessionhandler.cppsrc/active/dkgsessionhandler.hsrc/init.cppsrc/llmq/debug.cppsrc/llmq/debug.hsrc/llmq/dkgsession.hsrc/llmq/dkgsessionhandler.cppsrc/llmq/dkgsessionhandler.hsrc/llmq/dkgsessionmgr.cppsrc/llmq/dkgsessionmgr.hsrc/llmq/net_dkg.cppsrc/llmq/net_dkg.hsrc/llmq/observer.cppsrc/llmq/observer.hsrc/llmq/options.cppsrc/llmq/options.hsrc/net_processing.cppsrc/net_processing.hsrc/test/util/setup_common.cpptest/lint/lint-circular-dependencies.py
💤 Files with no reviewable changes (1)
- test/lint/lint-circular-dependencies.py
53be42b to
f4b6aae
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (7)
src/llmq/net_dkg.cpp (6)
100-110: 💤 Low valueDefensive null-check in fallback verification loop.
In the per-message fallback verification loop (lines 100-110),
memberis dereferenced at line 106 (member->dmn->pdmnState->pubKeyOperator.Get()) without a null check. The reasoning is that anynodeIdwhoseGetMember()returnednullptrin the first loop was already inserted intoret(line 54) and is filtered byret.count(nodeId)(line 101). That holds today, but an in-flight membership change between the twoGetMembercalls (or future refactors) would silently NPE here.A cheap guard would future-proof this:
🛡️ Proposed defensive check
auto member = session.GetMember(msg->proTxHash); + if (!member) { + ret.emplace(nodeId); + continue; + } bool valid = msg->sig.VerifyInsecure(member->dmn->pdmnState->pubKeyOperator.Get(), msg->GetSignHash());🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 100 - 110, The fallback verification loop can dereference a null member returned by session.GetMember; update the loop that iterates over messages to defensively check the result of session.GetMember(msg->proTxHash) before using member->dmn->pdmnState->pubKeyOperator.Get(): if member is null, insert nodeId into ret and continue, otherwise perform msg->sig.VerifyInsecure(..., msg->GetSignHash()) as before; this prevents a potential NPE if membership changes between the two GetMember calls.
466-473: 💤 Low value
Stop()is not idempotent w.r.t. unjoinable threads after early failure.
Stop()callsInterrupt()then joins all threads. If any thread inm_phase_threadsis in a non-joinable state (e.g., already joined orstd::threaddefault-constructed because creation failed),t.join()is skipped via thejoinable()guard — good. But if a thread is running and throws an unhandled exception before stop is requested,PhaseHandlerThreadonly catchesAbortPhaseException(line 492); any other exception will terminate the thread perstd::terminate. The catch-all loop inPhaseHandlerThreaddoes not include a genericcatch (...).Consider adding a final
catch (const std::exception&)to log and break the loop cleanly, soStop()always sees joinable but exited threads.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 466 - 473, Stop() can hang on threads that terminated due to unexpected exceptions because PhaseHandlerThread only catches AbortPhaseException; add a broad exception handler inside PhaseHandlerThread's main loop (after the AbortPhaseException catch) that catches const std::exception& (and optionally catch(...) as a fallback), logs the error with context, and breaks/returns cleanly so the std::thread object exits normally and becomes joinable for Stop() to join; reference PhaseHandlerThread, AbortPhaseException, m_phase_threads and Interrupt() when locating where to add the catch and logging.
244-244: 💤 Low valueRedundant temporary in
m_activeconstruction.
std::make_unique<ActiveDKG>(ActiveDKG{...})constructs a temporaryActiveDKGand then move/copy-constructs another insidemake_unique. You can forward the args directly:♻️ Proposed simplification
- m_active{std::make_unique<ActiveDKG>(ActiveDKG{dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman})} + m_active{std::make_unique<ActiveDKG>(dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman)}This requires
ActiveDKGto have an aggregate-equivalent constructor or a matching one; if it currently relies on aggregate brace-init, add an explicit constructor.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` at line 244, The m_active member is being initialized with a redundant temporary via std::make_unique<ActiveDKG>(ActiveDKG{...}); change the call to forward the constructor args directly (e.g. std::make_unique<ActiveDKG>(dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman)) so no extra temporary is created, and if ActiveDKG does not currently have a matching constructor add an explicit constructor on ActiveDKG that accepts (dmnman, mn_metaman, dkgdbgman, qblockman, qsnapman, connman) to allow direct in-place construction.
216-255: ⚖️ Poor tradeoffTwo near-identical constructors invite drift.
The observer (lines 216-231) and active (lines 233-253) constructors share initialization of
m_qdkgsman,m_qman,m_sporkman,m_chainman,m_quorums_watch, them_qdkgsman.InitializeHandlers(...)call, and the closingm_qman.ConnectManagers(...)call. The only differences are:
m_active(nullptrvs constructed)- the lambda type returned by
InitializeHandlers(CDKGSessionHandlervsActiveDKGSessionHandler)Consider delegating the active constructor to the observer one, or factor common init into a private helper, to keep the two paths (and the
ConnectManagers/DisconnectManagerslifecycle) in sync going forward.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 216 - 255, The two NetDKG constructors duplicate common initialization (m_qdkgsman, m_qman, m_sporkman, m_chainman, m_quorums_watch, the InitializeHandlers/ConnectManagers lifecycle) and should be unified: extract the shared setup into a private helper (e.g., InitCommon or InitializeManagers) that takes a std::function<std::unique_ptr<CDKGSessionHandler>(const Consensus::LLMQParams&, int)> or a template/lambda wrapper, or else implement constructor delegation from the active constructor to the observer constructor and only set m_active afterward; update the observer constructor to call the helper (or be the delegated target) and change the active constructor to only construct m_active and supply the ActiveDKGSessionHandler-producing lambda to the shared InitializeHandlers call, ensuring m_qman.ConnectManagers and m_qdkgsman.InitializeHandlers remain in one place.
451-464: ⚡ Quick win
dynamic_cast<ActiveDKGSessionHandler&>at line 458 will throw on type mismatch.
Start()is guarded bym_active == nullptr(line 452), and the active-mode constructor (lines 246-251) installsActiveDKGSessionHandlerinstances, so in practice the cast succeeds. However,m_qdkgsman.InitializeHandlers(...)is the only invariant that tiesm_active != nullptrto "all handlers areActiveDKGSessionHandler", and it lives outside this class. If that invariant is ever violated (a future code path that mixes handler types, or a partially initialized state),dynamic_caston a reference will throwstd::bad_castfrom inside this lambda chain and may abort the node.
Interrupt()(line 479) already uses the safer pointer form. Consider mirroring that here:🛡️ Safer cast in `Start()`
m_qdkgsman.ForEachHandler([this](CDKGSessionHandler& base) { - auto& handler = dynamic_cast<ActiveDKGSessionHandler&>(base); - std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler.params.type), handler.QuorumIndex()); - m_phase_threads.emplace_back([this, name = std::move(thread_name), &handler] { - util::TraceThread(name.c_str(), [this, &handler] { PhaseHandlerThread(handler); }); - }); + auto* handler = dynamic_cast<ActiveDKGSessionHandler*>(&base); + if (!handler) return; + std::string thread_name = strprintf("llmq-%d-%d", std23::to_underlying(handler->params.type), + handler->QuorumIndex()); + m_phase_threads.emplace_back([this, name = std::move(thread_name), handler] { + util::TraceThread(name.c_str(), [this, handler] { PhaseHandlerThread(*handler); }); + }); });🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 451 - 464, The dynamic_cast in Start() currently uses dynamic_cast<ActiveDKGSessionHandler&> inside the m_qdkgsman.ForEachHandler lambda which will throw std::bad_cast on mismatch; change it to use the pointer form dynamic_cast<ActiveDKGSessionHandler*> (matching Interrupt()) and check for nullptr before using handler: if the cast returns nullptr, log or skip creating the phase thread for that handler instead of dereferencing/throwing, otherwise capture the pointer and pass it to PhaseHandlerThread; update references in the lambda to use the pointer (e.g., handler->QuorumIndex(), PhaseHandlerThread(*handler)) so Start() no longer can throw from a bad_cast.
257-294: 💤 Low value
vRecvis moved intopmlater — make the rewind/peek pattern explicit.The flow at lines 291-294 reads
llmqType(1 byte) andquorumHash(32 bytes) and then rewinds both, with the goal of leavingvRecvuntouched so the entire payload (including header bytes) can be hashed and stored at lines 349-352. This works, but:
- The rewind order (uint256 then uint8_t) is the reverse of read order; that's intentional for
Rewindbut easy to break in the future.- A short comment would make the intent (peek-only) obvious to maintainers.
♻️ Suggested clarifying comment
+ // Peek llmqType + quorumHash without consuming so the full payload can be re-serialized + // and hashed below at line 349. Consensus::LLMQType llmqType; uint256 quorumHash; vRecv >> llmqType; vRecv >> quorumHash; vRecv.Rewind(sizeof(uint256)); vRecv.Rewind(sizeof(uint8_t));🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 257 - 294, The code in NetDKG::ProcessMessage reads llmqType and quorumHash from vRecv and then calls vRecv.Rewind in the reverse order to restore the stream for later hashing; make this intent explicit and less fragile by adding a short comment above the reads/Rewind calls describing that we are only peeking at the header bytes (uint8_t then uint256) and that the subsequent Rewind calls intentionally undo the reads so the full payload remains unchanged for hashing (used later when moving vRecv into pm). Optionally, reorder the Rewind calls to mirror the read order (Rewind(sizeof(uint8_t)) then Rewind(sizeof(uint256))) to reduce future mistakes, while keeping the explanatory comment.src/llmq/dkgsessionhandler.h (1)
112-121: 💤 Low valueDuplicate
public:access specifiers.Lines 112 and 121 both declare
public:with no interveningprivate:/protected:block, making the second specifier redundant. Consider consolidating into a single section.♻️ Proposed cleanup
public: const Consensus::LLMQParams& params; // Do not guard these, they protect their internals themselves CDKGPendingMessages pendingContributions; CDKGPendingMessages pendingComplaints; CDKGPendingMessages pendingJustifications; CDKGPendingMessages pendingPrematureCommitments; -public: explicit CDKGSessionHandler(const Consensus::LLMQParams& _params); virtual ~CDKGSessionHandler();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/dkgsessionhandler.h` around lines 112 - 121, The class header contains two consecutive "public:" access specifiers (surrounding members like params and CDKGPendingMessages pendingContributions/pendingComplaints/pendingJustifications/pendingPrematureCommitments), making the second one redundant; remove the duplicate "public:" and consolidate these members under a single public section so there is only one "public:" before params and the pending* members.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/llmq/dkgsession.cpp`:
- Around line 627-629: The empty stubs for CDKGSession::FinalizeCommitments()
and CDKGSession::FinalizeSingleCommitment() silently disable producing mineable
commitments in HandleDKGRound; either restore the original finalize logic into
these two methods (port the pre-refactor bodies into
CDKGSession::FinalizeCommitments and CDKGSession::FinalizeSingleCommitment so
they return the actual CFinalCommitment(s) used by
qblockman.AddMineableCommitment(...)), or if this work is intentionally
deferred, replace the stubs with an explicit TODO and a runtime-safe guard: log
an error (process/ulogs) and assert or throw so the failure is visible (do not
silently return {}), referencing the methods by name (FinalizeCommitments and
FinalizeSingleCommitment) and ensuring HandleDKGRound’s expectations are met.
In `@src/llmq/net_dkg.cpp`:
- Around line 384-411: AlreadyHave reports DKG invs seen by observer nodes
(m_quorums_watch) but ProcessGetData currently returns early when m_active ==
nullptr causing watchers to claim possession but refuse getdata; remove the
early "if (m_active == nullptr) return false;" check from NetDKG::ProcessGetData
so that m_qdkgsman.GetContribution/GetComplaint/etc. are dynamically dispatched
(base virtuals will return false in non-serving handlers), and keep or update
the explanatory comment to note that observer handlers will be consulted via
m_qdkgsman even when m_active is null.
- Around line 375-379: The current code calls
m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when DoForHandler indicates
no session handler (dispatched == false), which is too punitive; change the
penalty to 10 (i.e., m_peer_manager->PeerMisbehaving(pfrom.GetId(), 10)) or
alternatively remove the ban and replace it with a LogPrint/LogPrintf so
legitimate peers slightly ahead aren't banned—update the branch in NetDKG where
dispatched is checked (the block that logs "NetDKG -- no session handlers for
quorumIndex" and calls PeerMisbehaving) to apply the lower score or silent log
as discussed.
---
Nitpick comments:
In `@src/llmq/dkgsessionhandler.h`:
- Around line 112-121: The class header contains two consecutive "public:"
access specifiers (surrounding members like params and CDKGPendingMessages
pendingContributions/pendingComplaints/pendingJustifications/pendingPrematureCommitments),
making the second one redundant; remove the duplicate "public:" and consolidate
these members under a single public section so there is only one "public:"
before params and the pending* members.
In `@src/llmq/net_dkg.cpp`:
- Around line 100-110: The fallback verification loop can dereference a null
member returned by session.GetMember; update the loop that iterates over
messages to defensively check the result of session.GetMember(msg->proTxHash)
before using member->dmn->pdmnState->pubKeyOperator.Get(): if member is null,
insert nodeId into ret and continue, otherwise perform
msg->sig.VerifyInsecure(..., msg->GetSignHash()) as before; this prevents a
potential NPE if membership changes between the two GetMember calls.
- Around line 466-473: Stop() can hang on threads that terminated due to
unexpected exceptions because PhaseHandlerThread only catches
AbortPhaseException; add a broad exception handler inside PhaseHandlerThread's
main loop (after the AbortPhaseException catch) that catches const
std::exception& (and optionally catch(...) as a fallback), logs the error with
context, and breaks/returns cleanly so the std::thread object exits normally and
becomes joinable for Stop() to join; reference PhaseHandlerThread,
AbortPhaseException, m_phase_threads and Interrupt() when locating where to add
the catch and logging.
- Line 244: The m_active member is being initialized with a redundant temporary
via std::make_unique<ActiveDKG>(ActiveDKG{...}); change the call to forward the
constructor args directly (e.g. std::make_unique<ActiveDKG>(dmnman, mn_metaman,
dkgdbgman, qblockman, qsnapman, connman)) so no extra temporary is created, and
if ActiveDKG does not currently have a matching constructor add an explicit
constructor on ActiveDKG that accepts (dmnman, mn_metaman, dkgdbgman, qblockman,
qsnapman, connman) to allow direct in-place construction.
- Around line 216-255: The two NetDKG constructors duplicate common
initialization (m_qdkgsman, m_qman, m_sporkman, m_chainman, m_quorums_watch, the
InitializeHandlers/ConnectManagers lifecycle) and should be unified: extract the
shared setup into a private helper (e.g., InitCommon or InitializeManagers) that
takes a std::function<std::unique_ptr<CDKGSessionHandler>(const
Consensus::LLMQParams&, int)> or a template/lambda wrapper, or else implement
constructor delegation from the active constructor to the observer constructor
and only set m_active afterward; update the observer constructor to call the
helper (or be the delegated target) and change the active constructor to only
construct m_active and supply the ActiveDKGSessionHandler-producing lambda to
the shared InitializeHandlers call, ensuring m_qman.ConnectManagers and
m_qdkgsman.InitializeHandlers remain in one place.
- Around line 451-464: The dynamic_cast in Start() currently uses
dynamic_cast<ActiveDKGSessionHandler&> inside the m_qdkgsman.ForEachHandler
lambda which will throw std::bad_cast on mismatch; change it to use the pointer
form dynamic_cast<ActiveDKGSessionHandler*> (matching Interrupt()) and check for
nullptr before using handler: if the cast returns nullptr, log or skip creating
the phase thread for that handler instead of dereferencing/throwing, otherwise
capture the pointer and pass it to PhaseHandlerThread; update references in the
lambda to use the pointer (e.g., handler->QuorumIndex(),
PhaseHandlerThread(*handler)) so Start() no longer can throw from a bad_cast.
- Around line 257-294: The code in NetDKG::ProcessMessage reads llmqType and
quorumHash from vRecv and then calls vRecv.Rewind in the reverse order to
restore the stream for later hashing; make this intent explicit and less fragile
by adding a short comment above the reads/Rewind calls describing that we are
only peeking at the header bytes (uint8_t then uint256) and that the subsequent
Rewind calls intentionally undo the reads so the full payload remains unchanged
for hashing (used later when moving vRecv into pm). Optionally, reorder the
Rewind calls to mirror the read order (Rewind(sizeof(uint8_t)) then
Rewind(sizeof(uint256))) to reduce future mistakes, while keeping the
explanatory comment.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: d8950d10-63a8-42c9-9f3f-6963acd754d2
📒 Files selected for processing (16)
src/active/context.cppsrc/active/context.hsrc/active/dkgsession.cppsrc/init.cppsrc/llmq/dkgsession.cppsrc/llmq/dkgsession.hsrc/llmq/dkgsessionhandler.cppsrc/llmq/dkgsessionhandler.hsrc/llmq/dkgsessionmgr.cppsrc/llmq/dkgsessionmgr.hsrc/llmq/net_dkg.cppsrc/llmq/net_dkg.hsrc/llmq/observer.cppsrc/llmq/observer.hsrc/llmq/options.cppsrc/llmq/options.h
🚧 Files skipped from review as they are similar to previous changes (12)
- src/llmq/options.cpp
- src/llmq/observer.cpp
- src/llmq/options.h
- src/llmq/net_dkg.h
- src/llmq/dkgsessionmgr.cpp
- src/active/context.h
- src/llmq/dkgsessionhandler.cpp
- src/llmq/observer.h
- src/llmq/dkgsessionmgr.h
- src/llmq/dkgsession.h
- src/active/dkgsession.cpp
- src/init.cpp
| std::vector<CFinalCommitment> CDKGSession::FinalizeCommitments() { return {}; } | ||
|
|
||
| CFinalCommitment CDKGSession::FinalizeSingleCommitment() { return {}; } |
There was a problem hiding this comment.
Empty FinalizeCommitments() and FinalizeSingleCommitment() will silently break DKG output.
Both implementations return an empty vector and a default-constructed (null) CFinalCommitment respectively. These are the only paths in HandleDKGRound (see src/llmq/net_dkg.cpp lines 522-527 and 594-599) that produce mineable commitments via qblockman.AddMineableCommitment(...). With these stubs:
FinalizeSingleCommitment()returns{}whichIsNull()reports true → branch is skipped, no single-member commitment is ever produced.FinalizeCommitments()returns an empty vector → theforloop never executes, no multi-member commitment is ever relayed/mined.
If this is intentional (work-in-progress to be filled in by a follow-up commit/PR), please add a TODO/FIXME and a clear log/assert so the regression is not invisible at runtime; otherwise DKG silently never completes a quorum on this branch.
Could you confirm whether the original (pre-refactor) finalize logic is supposed to land in this PR or in a follow-up? I can help port the previous body if you'd like.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/dkgsession.cpp` around lines 627 - 629, The empty stubs for
CDKGSession::FinalizeCommitments() and CDKGSession::FinalizeSingleCommitment()
silently disable producing mineable commitments in HandleDKGRound; either
restore the original finalize logic into these two methods (port the
pre-refactor bodies into CDKGSession::FinalizeCommitments and
CDKGSession::FinalizeSingleCommitment so they return the actual
CFinalCommitment(s) used by qblockman.AddMineableCommitment(...)), or if this
work is intentionally deferred, replace the stubs with an explicit TODO and a
runtime-safe guard: log an error (process/ulogs) and assert or throw so the
failure is visible (do not silently return {}), referencing the methods by name
(FinalizeCommitments and FinalizeSingleCommitment) and ensuring HandleDKGRound’s
expectations are met.
| if (!dispatched) { | ||
| LogPrintf("NetDKG -- no session handlers for quorumIndex [%d]\n", quorumIndex); | ||
| m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100); | ||
| return; | ||
| } |
There was a problem hiding this comment.
PeerMisbehaving(100) for missing handler may be too punitive.
If DoForHandler returns false because no session handler exists for {llmqType, quorumIndex} (e.g., handlers haven't been initialized yet for this combination, or this node hasn't discovered that quorum is mineable), we instantly ban the peer with score 100. Earlier paths use 10 for "we might be lagging behind" cases (e.g., line 319). If a remote peer is slightly ahead, we'd ban legitimate participants.
Consider lowering to 10, or returning silently with a LogPrint, since the peer's message itself was structurally valid up to this point.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/net_dkg.cpp` around lines 375 - 379, The current code calls
m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when DoForHandler indicates
no session handler (dispatched == false), which is too punitive; change the
penalty to 10 (i.e., m_peer_manager->PeerMisbehaving(pfrom.GetId(), 10)) or
alternatively remove the ban and replace it with a LogPrint/LogPrintf so
legitimate peers slightly ahead aren't banned—update the branch in NetDKG where
dispatched is checked (the block that logs "NetDKG -- no session handlers for
quorumIndex" and calls PeerMisbehaving) to apply the lower score or silent log
as discussed.
| bool NetDKG::AlreadyHave(const CInv& inv) | ||
| { | ||
| switch (inv.type) { | ||
| case MSG_QUORUM_CONTRIB: | ||
| case MSG_QUORUM_COMPLAINT: | ||
| case MSG_QUORUM_JUSTIFICATION: | ||
| case MSG_QUORUM_PREMATURE_COMMITMENT: { | ||
| if (!IsQuorumDKGEnabled(m_sporkman)) return false; | ||
| bool seen = false; | ||
| m_qdkgsman.ForEachHandler([&](CDKGSessionHandler& h) { | ||
| if (seen) return; | ||
| if (h.pendingContributions.HasSeen(inv.hash) || h.pendingComplaints.HasSeen(inv.hash) || | ||
| h.pendingJustifications.HasSeen(inv.hash) || h.pendingPrematureCommitments.HasSeen(inv.hash)) { | ||
| seen = true; | ||
| } | ||
| }); | ||
| return seen; | ||
| } | ||
| } | ||
| return false; | ||
| } | ||
|
|
||
| bool NetDKG::ProcessGetData(CNode& pfrom, const CInv& inv, CConnman& connman, const CNetMsgMaker& msgMaker) | ||
| { | ||
| // Default implementations of GetContribution and the other virtual methods | ||
| // return false in observer mode; m_active is only an early exit and does | ||
| // not affect logic. | ||
| if (m_active == nullptr) return false; |
There was a problem hiding this comment.
Asymmetry between AlreadyHave and ProcessGetData for observer-mode (qwatch) nodes.
AlreadyHave (line 391) only gates on IsQuorumDKGEnabled(m_sporkman) and walks every handler's pending sets — observer/m_quorums_watch nodes will report true for DKG inv hashes they've seen.
ProcessGetData (line 411), however, short-circuits to false whenever m_active == nullptr, even though m_quorums_watch == true nodes do receive and forward DKG messages. The result: a watching node will tell peers "I already have this" via AlreadyHave, but then refuse to serve it via getdata. Peers can be left waiting/timing out instead of fetching from another source.
The inline comment ("default implementations of GetContribution... return false in observer mode") is true for the base CDKGSessionHandler, but the dispatch goes through m_qdkgsman.GetContribution(...) which dynamic-dispatches to the active handler in active mode. For pure m_quorums_watch=true (no m_active) the base virtuals do return false anyway — so the early return is redundant for that case and harmful for the truly-observer case.
Consider either:
- Removing the
m_active == nullptrearly return and relying on the virtual dispatch to returnfalsenaturally, or - Symmetrically gating
AlreadyHaveso it returnsfalsewhenm_active == nullptr.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/llmq/net_dkg.cpp` around lines 384 - 411, AlreadyHave reports DKG invs
seen by observer nodes (m_quorums_watch) but ProcessGetData currently returns
early when m_active == nullptr causing watchers to claim possession but refuse
getdata; remove the early "if (m_active == nullptr) return false;" check from
NetDKG::ProcessGetData so that m_qdkgsman.GetContribution/GetComplaint/etc. are
dynamically dispatched (base virtuals will return false in non-serving
handlers), and keep or update the explanatory comment to note that observer
handlers will be consulted via m_qdkgsman even when m_active is null.
|
This pull request has conflicts, please rebase. |
248ccaf to
c1c4e2a
Compare
|
This pull request has conflicts, please rebase. |
It shows the hidden circular dependency and tidy up list of includes
- removed method CDKGPendingMessages::Misbehaving(NodeId, int, PeerManager&), ProcessPendingMessageBatch calls peerman.Misbehaving(...) directly - renamed PushPendingMessage<Message>(NodeId, Message&, PeerManager&) to PushOwnPendingMessage for clear distinction of path with node=-1 (self made)
…from PeerManager Re-ordered initialization of PeerManager and ActiveContext / ObserverContext, PeerManager::make now takes nodeman raw ptr (or nullptr). It resolves several circular dependencies over net_processing and removes several unique_ptr<T&> work-arounds from PeerManager
It helps to drop dependency of llmq/dkgsessionhandler on network code
- moved implementation of ProcessMessage and AlreadyHave to NetDKG - drop usages of MessageProcessingResult in CDKGSessionManager - introduced a new helper DoForHandler
c1c4e2a to
22d5cec
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (2)
src/llmq/net_dkg.cpp (2)
375-379:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winPenalty of 100 for missing handler may be too punitive.
At line 319, a penalty of 10 is applied for unknown
quorumHashwith the rationale "we might be lagging behind." The same rationale applies here: ifDoForHandlerreturns false because no session handler exists for{llmqType, quorumIndex}, this node may simply not have initialized handlers for that quorum yet.Using 100 here risks instantly banning legitimate participants who are slightly ahead.
Proposed fix
if (!dispatched) { LogPrintf("NetDKG -- no session handlers for quorumIndex [%d]\n", quorumIndex); - m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100); + // NOTE: do not insta-ban, we might be lagging behind + m_peer_manager->PeerMisbehaving(pfrom.GetId(), 10); return; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 375 - 379, The current NetDKG branch penalizes peers with m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when dispatched is false; lower the penalty to match the earlier unknown quorumHash case (use 10) and keep the log message, because absence of a handler likely means we're lagging rather than malicious behavior—modify the call in the NetDKG handler where dispatched is false to use 10 instead of 100 (the symbols to change: variable quorumIndex, boolean dispatched, and method m_peer_manager->PeerMisbehaving).
384-411:⚠️ Potential issue | 🟡 Minor | ⚖️ Poor tradeoffAsymmetry between
AlreadyHaveandProcessGetDatafor observer-mode nodes.
AlreadyHave(line 391) only gates onIsQuorumDKGEnabled(m_sporkman)and walks every handler's pending sets, so observer nodes (m_quorums_watch=true,m_active=nullptr) will reporttruefor DKG inv hashes they've seen.
ProcessGetData(line 411) short-circuits tofalsewhenm_active == nullptr, meaning observer nodes will tell peers "I already have this" but then refuse to serve it viagetdata.Consider either:
- Removing the
m_active == nullptrearly return and relying on virtual dispatch to returnfalsenaturally, or- Symmetrically gating
AlreadyHaveto returnfalsewhenm_active == nullptr.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/llmq/net_dkg.cpp` around lines 384 - 411, AlreadyHave reports DKG invs as present for observer nodes while ProcessGetData refuses to serve them when m_active == nullptr, causing asymmetry; to fix, add a symmetric guard in NetDKG::AlreadyHave that returns false when m_active == nullptr (or when m_quorums_watch indicates observer mode) before calling IsQuorumDKGEnabled/ForEachHandler so observers won't claim they have items they won't serve; alternatively remove the early return in NetDKG::ProcessGetData so virtual handlers decide, but prefer adding the m_active/nullptr check in AlreadyHave to preserve current ProcessGetData behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@src/llmq/net_dkg.cpp`:
- Around line 375-379: The current NetDKG branch penalizes peers with
m_peer_manager->PeerMisbehaving(pfrom.GetId(), 100) when dispatched is false;
lower the penalty to match the earlier unknown quorumHash case (use 10) and keep
the log message, because absence of a handler likely means we're lagging rather
than malicious behavior—modify the call in the NetDKG handler where dispatched
is false to use 10 instead of 100 (the symbols to change: variable quorumIndex,
boolean dispatched, and method m_peer_manager->PeerMisbehaving).
- Around line 384-411: AlreadyHave reports DKG invs as present for observer
nodes while ProcessGetData refuses to serve them when m_active == nullptr,
causing asymmetry; to fix, add a symmetric guard in NetDKG::AlreadyHave that
returns false when m_active == nullptr (or when m_quorums_watch indicates
observer mode) before calling IsQuorumDKGEnabled/ForEachHandler so observers
won't claim they have items they won't serve; alternatively remove the early
return in NetDKG::ProcessGetData so virtual handlers decide, but prefer adding
the m_active/nullptr check in AlreadyHave to preserve current ProcessGetData
behavior.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 8ef86bf0-4e53-424c-a970-64325e97fd7d
📒 Files selected for processing (26)
src/Makefile.amsrc/active/context.cppsrc/active/context.hsrc/active/dkgsession.cppsrc/active/dkgsession.hsrc/active/dkgsessionhandler.cppsrc/active/dkgsessionhandler.hsrc/init.cppsrc/llmq/debug.cppsrc/llmq/debug.hsrc/llmq/dkgsession.cppsrc/llmq/dkgsession.hsrc/llmq/dkgsessionhandler.cppsrc/llmq/dkgsessionhandler.hsrc/llmq/dkgsessionmgr.cppsrc/llmq/dkgsessionmgr.hsrc/llmq/net_dkg.cppsrc/llmq/net_dkg.hsrc/llmq/observer.cppsrc/llmq/observer.hsrc/llmq/options.cppsrc/llmq/options.hsrc/net_processing.cppsrc/net_processing.hsrc/test/util/setup_common.cpptest/lint/lint-circular-dependencies.py
💤 Files with no reviewable changes (1)
- test/lint/lint-circular-dependencies.py
🚧 Files skipped from review as they are similar to previous changes (20)
- src/llmq/options.h
- src/llmq/observer.cpp
- src/Makefile.am
- src/llmq/debug.h
- src/llmq/observer.h
- src/llmq/options.cpp
- src/active/context.h
- src/llmq/dkgsessionmgr.h
- src/active/dkgsession.h
- src/active/dkgsessionhandler.h
- src/llmq/dkgsessionhandler.cpp
- src/llmq/dkgsessionmgr.cpp
- src/llmq/dkgsession.cpp
- src/llmq/dkgsession.h
- src/net_processing.h
- src/llmq/net_dkg.h
- src/init.cpp
- src/active/dkgsession.cpp
- src/llmq/dkgsessionhandler.h
- src/net_processing.cpp
thepastaclaw
left a comment
There was a problem hiding this comment.
Code Review
Refactor cleanly splits DKG network handling into NetDKG/NetDKGStub and reduces CDKGSessionManager to a state holder; no blocking issues found. The notable regression is that on the active-masternode path the startup tip is delivered to CDSNotificationInterface twice. Remaining items are minor cleanups and an architectural suggestion.
Reviewed commit: 22d5cec
🟡 3 suggestion(s) | 💬 3 nitpick(s)
6 finding(s) posted in the review body
suggestion: Active-node startup runs CDSNotificationInterface::InitializeCurrentBlockTip twice
src/init.cpp (lines 2451-2540)
On the active-masternode path the import thread first calls g_ds_notification_interface->InitializeCurrentBlockTip(tip, ibd) directly at line 2457, then later calls GetMainSignals().InitializeCurrentBlockTip(tip, ibd) at line 2540 after nodeman->Init(). Because g_ds_notification_interface is registered as a CValidationInterface at init.cpp:2213, the broadcast at line 2540 dispatches back through CDSNotificationInterface::InitializeCurrentBlockTip, which in turn invokes SynchronousUpdatedBlockTip and UpdatedBlockTip a second time for the same tip — re-running CDeterministicMNManager::UpdatedBlockTip, CMasternodeSync::UpdatedBlockTip, CDSTXManager::UpdatedBlockTip, and CGovernanceManager::UpdatedBlockTip. Either temporarily unregister g_ds_notification_interface around the second broadcast, or have the second call notify only the subscribers that depend on a valid proTxHash.
nitpick: Duplicate forward declarations in dkgsessionmgr.h
src/llmq/dkgsessionmgr.h (lines 35-45)
Five classes are forward-declared twice in this header block (CDKGSessionHandler, CDKGContribution, CDKGComplaint, CDKGJustification, CDKGPrematureCommitment). The second copies appear after class CQuorumSnapshotManager; and look like an accidental copy/paste from the include-cleanup commit. Legal but noisy.
💡 Suggested change
class CDKGComplaint;
class CDKGContribution;
class CDKGJustification;
class CDKGPrematureCommitment;
class CDKGSessionHandler;
class CQuorumSnapshotManager;
suggestion: DKG inv types now go through MSG_DSQ's cj_walletman lookup before reaching handlers
src/net_processing.cpp (lines 2378-2387)
The combined switch arm groups MSG_QUORUM_CONTRIB/COMPLAINT/JUSTIFICATION/PREMATURE_COMMITMENT with MSG_DSQ, so every DKG inv first calls m_cj_walletman->hasQueue(inv.hash) before iterating m_handlers. The hash spaces are independent in practice so a false positive is astronomically unlikely, but the structural coupling is misleading and the lookup is wasted work on every DKG inv (hot during DKG phases). Either split MSG_DSQ into its own case, or iterate m_handlers first and let the CoinJoin pre-check live behind MSG_DSQ.
suggestion: NetDKG has no destructor that joins m_phase_threads
src/llmq/net_dkg.cpp (lines 466-473)
NetDKG::Stop() joins m_phase_threads and is currently invoked through PeerManagerImpl::RemoveHandlers() before ~NetDKG runs. If a NetDKG is ever destroyed outside that PeerManager-owned lifecycle (tests, future refactors), the base ~NetHandler runs only the static Interrupt()/Stop() (derived vtable is gone), and the phase threads — which capture references to ActiveDKG-owned objects (dmnman, mn_metaman, qsnapman, connman) — outlive their captured state. A ~NetDKG() that calls Stop() explicitly would make this robust to lifecycle changes.
nitpick: Unreachable guard: NetDKG is never installed with !is_masternode && !m_quorums_watch
src/llmq/net_dkg.cpp (lines 278-282)
At init.cpp:2232-2245, NetDKG is only registered when active_ctx is set (active masternode, m_active != nullptr) or observer_ctx is set (observer mode, constructed with quorums_watch=true); everything else gets NetDKGStub. The !is_masternode && !m_quorums_watch branch therefore cannot be reached on a live NetDKG. Either drop it and rely on NetDKGStub, or add a short comment explaining it as defense-in-depth.
nitpick: No targeted tests for NetDKG/NetDKGStub misbehavior contract
src/llmq/net_dkg.h (lines 52-123)
Existing functional DKG tests cover the happy paths, but the new NetDKG/NetDKGStub split introduces distinct misbehavior scores (10 vs 100) depending on mode (stub vs observer-without-watch vs QWATCH-from-non-masternode). A small unit test exercising each PeerMisbehaving branch would prevent future regressions in the ban-score contract.
🤖 Prompt for all review comments with AI agents
These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.
- [SUGGESTION] In `src/init.cpp`:2451-2540: Active-node startup runs CDSNotificationInterface::InitializeCurrentBlockTip twice
On the active-masternode path the import thread first calls `g_ds_notification_interface->InitializeCurrentBlockTip(tip, ibd)` directly at line 2457, then later calls `GetMainSignals().InitializeCurrentBlockTip(tip, ibd)` at line 2540 after `nodeman->Init()`. Because `g_ds_notification_interface` is registered as a `CValidationInterface` at init.cpp:2213, the broadcast at line 2540 dispatches back through `CDSNotificationInterface::InitializeCurrentBlockTip`, which in turn invokes `SynchronousUpdatedBlockTip` and `UpdatedBlockTip` a second time for the same tip — re-running `CDeterministicMNManager::UpdatedBlockTip`, `CMasternodeSync::UpdatedBlockTip`, `CDSTXManager::UpdatedBlockTip`, and `CGovernanceManager::UpdatedBlockTip`. Either temporarily unregister `g_ds_notification_interface` around the second broadcast, or have the second call notify only the subscribers that depend on a valid proTxHash.
- [NITPICK] In `src/llmq/dkgsessionmgr.h`:35-45: Duplicate forward declarations in dkgsessionmgr.h
Five classes are forward-declared twice in this header block (CDKGSessionHandler, CDKGContribution, CDKGComplaint, CDKGJustification, CDKGPrematureCommitment). The second copies appear after `class CQuorumSnapshotManager;` and look like an accidental copy/paste from the include-cleanup commit. Legal but noisy.
- [SUGGESTION] In `src/net_processing.cpp`:2378-2387: DKG inv types now go through MSG_DSQ's cj_walletman lookup before reaching handlers
The combined switch arm groups MSG_QUORUM_CONTRIB/COMPLAINT/JUSTIFICATION/PREMATURE_COMMITMENT with MSG_DSQ, so every DKG inv first calls `m_cj_walletman->hasQueue(inv.hash)` before iterating `m_handlers`. The hash spaces are independent in practice so a false positive is astronomically unlikely, but the structural coupling is misleading and the lookup is wasted work on every DKG inv (hot during DKG phases). Either split MSG_DSQ into its own case, or iterate `m_handlers` first and let the CoinJoin pre-check live behind MSG_DSQ.
- [SUGGESTION] In `src/llmq/net_dkg.cpp`:466-473: NetDKG has no destructor that joins m_phase_threads
`NetDKG::Stop()` joins `m_phase_threads` and is currently invoked through `PeerManagerImpl::RemoveHandlers()` before `~NetDKG` runs. If a `NetDKG` is ever destroyed outside that PeerManager-owned lifecycle (tests, future refactors), the base `~NetHandler` runs only the static `Interrupt()/Stop()` (derived vtable is gone), and the phase threads — which capture references to `ActiveDKG`-owned objects (dmnman, mn_metaman, qsnapman, connman) — outlive their captured state. A `~NetDKG()` that calls `Stop()` explicitly would make this robust to lifecycle changes.
- [NITPICK] In `src/llmq/net_dkg.cpp`:278-282: Unreachable guard: NetDKG is never installed with !is_masternode && !m_quorums_watch
At init.cpp:2232-2245, NetDKG is only registered when `active_ctx` is set (active masternode, `m_active != nullptr`) or `observer_ctx` is set (observer mode, constructed with `quorums_watch=true`); everything else gets NetDKGStub. The `!is_masternode && !m_quorums_watch` branch therefore cannot be reached on a live NetDKG. Either drop it and rely on NetDKGStub, or add a short comment explaining it as defense-in-depth.
- [NITPICK] In `src/llmq/net_dkg.h`:52-123: No targeted tests for NetDKG/NetDKGStub misbehavior contract
Existing functional DKG tests cover the happy paths, but the new NetDKG/NetDKGStub split introduces distinct misbehavior scores (10 vs 100) depending on mode (stub vs observer-without-watch vs QWATCH-from-non-masternode). A small unit test exercising each PeerMisbehaving branch would prevent future regressions in the ban-score contract.
Inline posting failed; posted verified findings in the top-level review body.
Issue being fixed or feature implemented
This PR is continuous of #7247
This PR is not direct dependency of kernel project.
This PR aim to resolve next issues:
constructor of PeerManager uses references to unique_ptr to multiple objects that will be initialized later, such as:
That's a fragile design that has multiple assumptions about already initialized members and their life term
What was done?
CDKGSessionManageris reduced to a pure state class, it owns DB and provides 2 new helper:ForEachHandler/DoForHandlerCDKGSessionHandlerandActiveDKGSessionHandlerloses its threading andProcessMessagemembersMessageProcessingResultusages are dropped from llmq/ consensus codeNetDKGis introduced which takes responsibilities for p2p communications for DKG works and for running threadsHow Has This Been Tested?
Removed circular dependency over
dkgsessionhandler <-> net_processingBreaking Changes
N/A
Checklist: