Fix consensus routine synchronization on 4+ nodes #476
Conversation
…tary node + use fs promises instead of writeFileSync + include older transactions in getMempool when given blocknumber + optimize nested for loops in broadcastBlockHash.ts and peerGossip.ts + disable abort of SET_WAIT_STATUS + add yieldToEventLoop in mainLoop + move mainLoopSleepTime to sharedState + update waiter class to prehold waiter key with return value
…ound - require secretaryManager.getInstance to use a blockRef number + handle instance deletion + transmit block ref on greenlight + handle non existent instances + update Waiter.wait to allow waiting for key in multiple places
Walkthrough

This PR transitions many synchronous filesystem writes to async/await, introduces per-block SecretaryManager instances keyed by blockRef, updates consensus and mempool flows to be blockRef- and shared-state-driven, refines shard/seed computation, augments logging, adds event-loop yielding in the main loop (a sketch follows the sequence diagrams), and removes deprecated shared state fields.
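For orientation, here is a minimal sketch of the per-blockRef instance pattern described above. The names (SecretaryManager, getInstance, blockRef) come from this PR, but the body is an illustrative assumption, not the actual implementation:

```ts
// Sketch only: a per-block "singleton" registry keyed by blockRef.
class SecretaryManager {
    private static instances = new Map<number, SecretaryManager>()

    // Returns the instance for blockRef. Creates one only when
    // initialize is true; otherwise null means "no consensus round
    // is being tracked for that block".
    static getInstance(
        blockRef: number,
        initialize = false,
    ): SecretaryManager | null {
        let instance = SecretaryManager.instances.get(blockRef)
        if (!instance && initialize) {
            instance = new SecretaryManager()
            SecretaryManager.instances.set(blockRef, instance)
        }
        return instance ?? null
    }

    // Instances are deleted once the consensus round for blockRef ends,
    // so stale rounds cannot leak state into later blocks.
    static deleteInstance(blockRef: number): void {
        SecretaryManager.instances.delete(blockRef)
    }
}
```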
Sequence Diagram(s)

```mermaid
sequenceDiagram
autonumber
participant Node as Node
participant Shared as SharedState
participant SecMgr as SecretaryManager (per blockRef)
participant Mempool as Mempool
participant Network as Network RPC
rect rgb(235,245,255)
note over Node,Shared: Determine next block reference
Node->>Shared: get lastBlockNumber
Shared-->>Node: N
Node->>Node: blockRef = N + 1
end
rect rgb(245,255,245)
note over Node,SecMgr: Initialize per-block secretary
Node->>SecMgr: getInstance(blockRef, initialize=true)
SecMgr-->>Node: instance
end
rect rgb(255,250,235)
note over Node,Mempool: Retrieve mempool for blockRef
Node->>Mempool: getMempool(blockRef)
Mempool-->>Node: transactions
end
rect rgb(245,245,255)
note over Network,SecMgr: Incoming consensus RPC (e.g., setValidatorPhase/greenlight)
Network->>Shared: getCommonValidatorSeed()
Network->>Node: getShard(seed)
alt validator in shard
Network->>SecMgr: getInstance(blockRef)
alt manager found
Network->>SecMgr: process request (blockRef-scoped)
SecMgr-->>Network: response
else manager missing
Network-->>Node: 200 with greenlight or soft error
end
else not in shard
Network-->>Node: 400 with diagnostics (except special cases)
end
end
```
```mermaid
sequenceDiagram
autonumber
participant Leader as Leader Node
participant Peers as Validators
participant Verify as Signature Verify
rect rgb(255,245,250)
note over Leader,Peers: Broadcast block hash and collect signatures (parallel)
Leader->>Peers: request signatures (incomingSignatures)
Leader->>Verify: Promise.all(verify(identity, sig)) for each entry
Verify-->>Leader: [{identity, signature, isValid}, ...]
Leader->>Leader: attach valid signatures to block.validation_data
Leader-->>Peers: proceed after all verifications
end
```
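The "event-loop yielding" added to the main loop (mentioned in the walkthrough and commit notes) commonly looks like the sketch below. The helper name yieldToEventLoop matches the commit message; its body and the mainLoop shape are assumptions:

```ts
// setImmediate defers the continuation until after pending I/O callbacks,
// so a long-running loop does not starve timers and network events.
function yieldToEventLoop(): Promise<void> {
    return new Promise(resolve => setImmediate(resolve))
}

async function mainLoop(sleepTimeMs: number): Promise<void> {
    for (;;) {
        // ... gossip, sync, and consensus work would run here ...
        await yieldToEventLoop() // let queued I/O run between iterations
        await new Promise(resolve => setTimeout(resolve, sleepTimeMs))
    }
}
```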
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
src/libs/consensus/v2/routines/broadcastBlockHash.ts (1)

59-89: Fix signature counting in broadcastBlockHash

The Promise.all refactor currently does a single `pro++` and never updates `con`. Update the code to:

```diff
- const signatureVerificationPromises = Object.entries(incomingSignatures).map(async ([identity, signature]) => { … })
- await Promise.all(signatureVerificationPromises)
- pro++
+ const results = await Promise.all(
+     Object.entries(incomingSignatures).map(async ([identity, signature]) => { … })
+ )
+ const validCount = results.filter(r => r.isValid).length
+ const invalidCount = results.length - validCount
+ pro += validCount
+ con += invalidCount
```

so that both `pro` and `con` accurately reflect valid and invalid signatures.
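As a self-contained illustration of the suggested counting pattern (verify and incomingSignatures are placeholder names, not the project's actual API):

```ts
// Verify all signatures in parallel, then derive pro/con from the results
// instead of incrementing a counter once for the whole batch.
async function tallySignatures(
    incomingSignatures: Record<string, string>,
    verify: (identity: string, signature: string) => Promise<boolean>,
): Promise<{ pro: number; con: number }> {
    const results = await Promise.all(
        Object.entries(incomingSignatures).map(
            async ([identity, signature]) => ({
                identity,
                signature,
                isValid: await verify(identity, signature),
            }),
        ),
    )
    const pro = results.filter(r => r.isValid).length
    const con = results.length - pro
    return { pro, con }
}
```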
src/libs/consensus/v2/routines/manageProposeBlockHash.ts (1)

24-39: Add null check for peer lookup.

`PeerManager.getInstance().getPeer(peerId)` on line 29 could return `undefined` if the peer is not found. Using `peer.connection.string` on line 33 without a null check will cause a runtime error. Apply this diff to add a null check:

```diff
  const validator = shard.find(validator => validator.identity === peerId)
  const peer = PeerManager.getInstance().getPeer(peerId)
  if (!validator) {
+     const connectionStr = peer?.connection?.string || peerId
      log.error(
          "[manageProposeBlockHash] Validator (" +
-             peer.connection.string +
+             connectionStr +
              ") is not in the shard: refusing the block hash",
      )
      response.result = 401
      response.response = getSharedState.publicKeyHex
      response.extra = "Validator is not in the shard"
      return response
  }
```
src/utilities/logger.ts (1)

89-110: Prevent unhandled rejections in `Logger.custom`.

Marking `custom` as `async` means every caller now receives a Promise they never await. If `fs.promises.writeFile` rejects (disk full, EACCES, etc.), the Promise rejects without handlers, and Node 18+ treats that as a fatal unhandled rejection. Wrap the awaited call so `custom` never propagates a rejection.

```diff
- if (cleanFile) {
-     fs.rmSync(this.LOG_CUSTOM_PREFIX + logfile + ".log", {
-         force: true,
-     })
-     await fs.promises.writeFile(this.LOG_CUSTOM_PREFIX + logfile + ".log", "")
- }
+ if (cleanFile) {
+     try {
+         fs.rmSync(this.LOG_CUSTOM_PREFIX + logfile + ".log", {
+             force: true,
+         })
+         await fs.promises.writeFile(
+             this.LOG_CUSTOM_PREFIX + logfile + ".log",
+             "",
+         )
+     } catch (error) {
+         console.error(
+             `[Logger.custom] Failed to reset ${logfile} log file:`,
+             error,
+         )
+     }
+ }
```
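A minimal standalone sketch of the "never let the logger reject" pattern, using Node's fs/promises API (the file name and message format are illustrative):

```ts
import { promises as fsp } from "fs"

// Callers fire-and-forget this, so any I/O failure must be contained
// here rather than surfacing as an unhandled rejection.
async function safeAppend(file: string, line: string): Promise<void> {
    try {
        await fsp.appendFile(file, line + "\n")
    } catch (error) {
        // Last resort: report to stderr so the failure stays visible.
        console.error(`[logger] failed to write ${file}:`, error)
    }
}
```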
🧹 Nitpick comments (5)
src/utilities/sharedState.ts (1)

33-34: Respect explicit zero MAIN_LOOP_SLEEP_TIME values.

Using `parseInt(...) || 1000` turns an explicit `MAIN_LOOP_SLEEP_TIME=0` into 1000, making it impossible to disable or minimize the sleep. Please treat `0` as a valid value and only fall back when the env var is missing or NaN.

```diff
- mainLoopSleepTime = parseInt(process.env.MAIN_LOOP_SLEEP_TIME) || 1000 // 1 second
+ private readonly defaultMainLoopSleep = 1000
+ mainLoopSleepTime = (() => {
+     const parsed = Number(process.env.MAIN_LOOP_SLEEP_TIME)
+     return Number.isFinite(parsed) ? parsed : this.defaultMainLoopSleep
+ })() // milliseconds
```
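A quick runnable demonstration of the footgun this nitpick describes:

```ts
// parseInt("0") is 0, which is falsy, so `|| 1000` silently discards it:
const withOr = parseInt("0", 10) || 1000
console.log(withOr) // 1000 — the explicit zero is lost

// A NaN-aware fallback keeps 0 as a valid setting:
const parsed = Number("0")
const withCheck = Number.isFinite(parsed) ? parsed : 1000
console.log(withCheck) // 0
```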
src/libs/network/manageConsensusRoutines.ts (1)

97-118: Simplify/clarify not-in-shard gating; avoid labeled break

The label/break on an if is non-idiomatic and easy to misread. A simple guard is clearer.

```diff
-    log.debug("isInShard: " + isInShard)
-
-    inShardCheck: if (!isInShard) {
-        // INFO: If is a greenlight request, return 200
-        if (payload.method == "greenlight") {
-            // response.result = 200
-            // response.response = "Greenlight received too late, ignoring"
-
-            // return response
-            break inShardCheck
-        }
-
-        if (payload.method == "setValidatorPhase") {
-            // response.result = 200
-            // response.response =
-            //     "Set validator phase received too late, ignoring"
-            // response.extra = {
-            //     greenlight: true,
-            // }
-            // return response
-            break inShardCheck
-        }
-
-        response.result = 400
+    log.debug("isInShard: " + isInShard)
+    if (!isInShard && payload.method !== "greenlight" && payload.method !== "setValidatorPhase") {
+        response.result = 400
         response.response =
             "We are not in the shard(" +
             getSharedState.exposedUrl +
             "), cannot proceed with the routine"
         log.error("🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒")
         log.error("Payload: " + JSON.stringify(payload, null, 2))
         log.error(
             "We are not in the shard(" +
                 getSharedState.exposedUrl +
                 "), cannot proceed with the routine",
         )
         log.error("current validator seed: " + commonValidatorSeed)
         log.error(
             "calculated shard: " +
                 JSON.stringify(
                     shard.map(m => m.connection.string),
                     null,
                     2,
                 ),
         )
         const sharedStateLastShard = shard.map(m => m.connection.string)
         log.error(
             "shared state last shard: " +
                 JSON.stringify(sharedStateLastShard, null, 2),
         )
         log.error("last block number: " + getSharedState.lastBlockNumber)
         log.error("🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒🚒")

         // INFO: Check if seed is from past consensus round
         // const lastBlockMinus1 = await Chain.getBlockByNumber(getSharedState.lastBlockNumber - 1)
         // const { commonValidatorSeed: pastCommonValidatorSeed } = await getCommonValidatorSeed(lastBlockMinus1)
         // if (pastCommonValidatorSeed == commonValidatorSeed) {
         //     log.error("Seed is from past consensus round")
         //     response.result = 400
         //     response.response = "Seed is from past consensus round"
         //     return response
         // }

         return response
     }
```

Also applies to: 120-161
src/libs/consensus/v2/types/secretaryManager.ts (3)

790-802: Resolve waiter when secretary is absent but greenlight is granted

When sendStatus returns result 500 with extra.greenlight=true (no secretary), handleSendStatusRes logs and does not resolve the waiter, causing timeouts/hangs.

```diff
  if (res.result == 500 || res.result == 400) {
      log.debug(
          "[SEND OUR VALIDATOR PHASE] Error sending the setValidatorPhase request",
      )
      log.debug("Response: " + JSON.stringify(res, null, 2))
-     // REVIEW: How should we handle this?
+     // Handle greenlight provided in error responses (e.g., peer behind/no manager)
+     if (res.extra?.greenlight) {
+         const ts = res.extra?.timestamp ?? null
+         if (Waiter.isWaiting(waiterKey)) {
+             return Waiter.resolve<number>(waiterKey, ts)
+         }
+         return ts
+     }
+     // REVIEW: How should we handle this?
      // NOTE: A 400 is returned if the block reference is
      // lower than the secretary's block reference
      // await this.handleSecretaryGoneOffline()
      // await sendStatus()
  }
```

Also applies to: 824-839
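For context, a minimal sketch of the Waiter rendezvous the diff relies on. The real class lives in src/utilities/waiter.ts; the shape below (isWaiting, resolve, and pre-held values per the commit notes) is an assumption:

```ts
// Key-based rendezvous: wait() blocks until resolve() supplies a value.
// If resolve() arrives first, the value is pre-held so a later wait()
// returns immediately instead of timing out.
class Waiter {
    private static pending = new Map<string, (value: unknown) => void>()
    private static preHeld = new Map<string, unknown>()

    static isWaiting(key: string): boolean {
        return Waiter.pending.has(key)
    }

    static wait<T>(key: string): Promise<T> {
        if (Waiter.preHeld.has(key)) {
            const value = Waiter.preHeld.get(key) as T
            Waiter.preHeld.delete(key)
            return Promise.resolve(value)
        }
        return new Promise<T>(resolve => {
            Waiter.pending.set(key, resolve as (value: unknown) => void)
        })
    }

    static resolve<T>(key: string, value: T): T {
        const pendingResolve = Waiter.pending.get(key)
        if (pendingResolve) {
            pendingResolve(value)
            Waiter.pending.delete(key)
        } else {
            Waiter.preHeld.set(key, value) // pre-hold for a future wait()
        }
        return value
    }
}
```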
640-653: Guard timestamp comparisons when blockTimestamp is null

Avoid null-coercion surprises; compare only when the current blockTimestamp is set.

```diff
- if (secretaryBlockTimestamp < this.blockTimestamp) {
+ if (this.blockTimestamp != null && secretaryBlockTimestamp < this.blockTimestamp) {
      log.debug(
          "Greenlight received for an older block,returning false ...",
      )
      return false
  }
- if (
-     secretaryBlockTimestamp &&
-     secretaryBlockTimestamp > this.blockTimestamp
- ) {
+ if (
+     secretaryBlockTimestamp != null &&
+     (this.blockTimestamp == null ||
+         secretaryBlockTimestamp > this.blockTimestamp)
+ ) {
      this.blockTimestamp = secretaryBlockTimestamp
  }
```
881-923: Avoid process.exit in library cleanup path

Hard exits in endConsensusRoutine can kill the node under transient conditions. Prefer propagating errors or safe-guarded returns.
- Replace process.exit(1) calls with error logs and safe returns.
- Consider aborting outstanding waiters instead of blocking on Promise.all indefinitely.
Would you like a patch to convert these to non-fatal paths?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (28)
- src/index.ts (1 hunks)
- src/libs/blockchain/gcr/gcr.ts (2 hunks)
- src/libs/blockchain/mempool_v2.ts (6 hunks)
- src/libs/blockchain/routines/Sync.ts (3 hunks)
- src/libs/consensus/routines/consensusTime.ts (3 hunks)
- src/libs/consensus/v2/PoRBFT.ts (14 hunks)
- src/libs/consensus/v2/routines/broadcastBlockHash.ts (3 hunks)
- src/libs/consensus/v2/routines/createBlock.ts (1 hunks)
- src/libs/consensus/v2/routines/getCommonValidatorSeed.ts (1 hunks)
- src/libs/consensus/v2/routines/getShard.ts (1 hunks)
- src/libs/consensus/v2/routines/manageProposeBlockHash.ts (2 hunks)
- src/libs/consensus/v2/types/secretaryManager.ts (9 hunks)
- src/libs/identity/identity.ts (1 hunks)
- src/libs/identity/tools/twitter.ts (3 hunks)
- src/libs/network/manageConsensusRoutines.ts (8 hunks)
- src/libs/peer/Peer.ts (1 hunks)
- src/libs/peer/PeerManager.ts (1 hunks)
- src/libs/peer/routines/peerGossip.ts (2 hunks)
- src/libs/utils/keyMaker.ts (2 hunks)
- src/model/datasource.ts (0 hunks)
- src/utilities/backupAndRestore.ts (1 hunks)
- src/utilities/cli_libraries/wallet.ts (3 hunks)
- src/utilities/commandLine.ts (1 hunks)
- src/utilities/logger.ts (2 hunks)
- src/utilities/mainLoop.ts (6 hunks)
- src/utilities/selfPeer.ts (1 hunks)
- src/utilities/sharedState.ts (3 hunks)
- src/utilities/waiter.ts (6 hunks)
💤 Files with no reviewable changes (1)
- src/model/datasource.ts
🧰 Additional context used
🧬 Code graph analysis (10)

src/libs/blockchain/routines/Sync.ts (1)
- src/utilities/sharedState.ts (1): getSharedState (265-267)

src/libs/consensus/routines/consensusTime.ts (1)
- src/utilities/sharedState.ts (1): getSharedState (265-267)

src/utilities/mainLoop.ts (5)
- src/utilities/sharedState.ts (1): getSharedState (265-267)
- src/libs/peer/routines/checkOfflinePeers.ts (1): checkOfflinePeers (6-31)
- src/libs/peer/routines/peerGossip.ts (1): peerGossip (28-39)
- src/libs/blockchain/routines/Sync.ts (1): fastSync (529-544)
- src/libs/consensus/v2/PoRBFT.ts (1): consensusRoutine (58-251)

src/libs/consensus/v2/PoRBFT.ts (6)
- src/utilities/sharedState.ts (1): getSharedState (265-267)
- src/libs/consensus/v2/types/secretaryManager.ts (2): SecretaryManager (15-1004), initializeShard (49-101)
- src/libs/consensus/v2/routines/getCommonValidatorSeed.ts (1): getCommonValidatorSeed (58-132)
- src/libs/blockchain/mempool_v2.ts (1): Mempool (18-206)
- src/libs/consensus/v2/routines/mergeMempools.ts (1): mergeMempools (7-34)
- src/exceptions/index.ts (1): ForgingEndedError (42-47)

src/utilities/commandLine.ts (1)
- src/utilities/cli_libraries/wallet.ts (1): Wallet (10-134)

src/libs/blockchain/mempool_v2.ts (2)
- src/utilities/sharedState.ts (1): getSharedState (265-267)
- src/libs/consensus/v2/types/secretaryManager.ts (1): SecretaryManager (15-1004)

src/libs/identity/identity.ts (1)
- src/utilities/sharedState.ts (1): getSharedState (265-267)

src/libs/consensus/v2/types/secretaryManager.ts (4)
- src/exceptions/index.ts (1): NotInShardError (35-40)
- src/utilities/waiter.ts (1): Waiter (25-148)
- src/utilities/logger.ts (1): error (125-132)
- src/utilities/sharedState.ts (1): getSharedState (265-267)

src/libs/consensus/v2/routines/manageProposeBlockHash.ts (3)
- src/libs/consensus/v2/routines/getCommonValidatorSeed.ts (1): getCommonValidatorSeed (58-132)
- src/libs/consensus/v2/routines/getShard.ts (1): getShard (8-57)
- src/libs/peer/PeerManager.ts (1): PeerManager (20-423)

src/libs/network/manageConsensusRoutines.ts (4)
- src/libs/peer/PeerManager.ts (1): PeerManager (20-423)
- src/libs/consensus/v2/routines/getCommonValidatorSeed.ts (1): getCommonValidatorSeed (58-132)
- src/libs/consensus/v2/routines/getShard.ts (1): getShard (8-57)
- src/libs/consensus/v2/types/secretaryManager.ts (1): SecretaryManager (15-1004)
🔇 Additional comments (15)
src/libs/consensus/v2/routines/createBlock.ts (1)

10-10: Import hookup looks good. The new encryption helpers integrate cleanly with the existing signing flow; no concerns.

src/libs/peer/PeerManager.ts (1)

44-44: LGTM! Removing the explicit type annotation is fine here since TypeScript can infer the type from the string literal initializer.

src/libs/identity/tools/twitter.ts (1)

476-476: LGTM! The change from `log.only` to `log.debug` is appropriate for delayed request logging.

src/utilities/cli_libraries/wallet.ts (1)

35-39: LGTM! The async changes are properly implemented:
- `dispatch` and `save` are now correctly marked as async
- The `save` call is properly awaited in the dispatch method
- The async file write aligns with the broader I/O migration in this PR

Also applies to: 84-84, 123-124

src/utilities/commandLine.ts (1)

41-41: LGTM! Properly awaits the now-async `dispatch` method, aligning with the changes in `wallet.ts`.

src/index.ts (1)

245-248: LGTM! The migration to async file writing is correctly implemented and aligns with the broader async I/O changes in this PR.

src/libs/blockchain/routines/Sync.ts (1)

253-253: LGTM! The additional debug logging will help troubleshoot block download and insertion flows without impacting functionality.

Also applies to: 268-268, 295-296

src/libs/identity/identity.ts (1)

147-149: LGTM! The conversion from synchronous `fs.writeFileSync` to asynchronous `fs.promises.writeFile` is correct and aligns with the broader async I/O pattern in this PR. The function is already async, so awaiting the write operation is appropriate.

src/utilities/backupAndRestore.ts (1)

89-93: LGTM! The async file write operation is correctly implemented. The function is already async and has appropriate error handling via the surrounding try-catch block.

src/libs/peer/Peer.ts (1)

165-169: LGTM! Adding the peer connection string to the error log provides valuable context for debugging. The connection string is always available via `this.connection.string`, so there's no risk of undefined access.

src/libs/utils/keyMaker.ts (2)

29-29: LGTM! The conversion to async file deletion is correct. The function is already async, and the await ensures proper sequencing.

41-42: LGTM! The async file write operations are correctly implemented with proper awaits. This ensures non-blocking I/O while maintaining proper sequencing.

src/libs/consensus/v2/routines/getCommonValidatorSeed.ts (1)

125-128: Removal of `currentValidatorSeed` assignment is safe. No remaining references to `currentValidatorSeed` outside the commented lines in getCommonValidatorSeed.ts.

src/libs/blockchain/gcr/gcr.ts (1)

90-101: Verify all `add` call sites handle the async signature. No occurrences of `new OperationsRegistry` or awaited calls to `.add()` were found in src/libs/blockchain/gcr. Search the entire codebase for `new OperationsRegistry` and `.add()`, and ensure every call to `.add()` uses `await` or otherwise handles the returned promise.

src/libs/consensus/routines/consensusTime.ts (1)

32-32: Dismiss incorrect concern about missing fallback. `getConsensusTime()` internally falls back to `block_time` (10) when `process.env.CONSENSUS_TIME` is unset or invalid, so it always returns a valid number.

Likely an incorrect or invalid review comment.
```ts
} from "src/exceptions"
import HandleGCR from "src/libs/blockchain/gcr/handleGCR"
import { GCREdit } from "@kynesyslabs/demosdk/types"
import { Waiter } from "@/utilities/waiter"
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Unify Waiter import path to avoid duplicate singletons
This file imports Waiter via "@/utilities/waiter" while others use "src/utilities/waiter". Mixed aliases can create duplicate module instances if paths aren’t normalized, breaking shared static state.
Please standardize on one path (e.g., "src/utilities/waiter").
To verify current usages:
🏁 Script executed:

```bash
#!/bin/bash
# Find all Waiter imports and their paths
rg -nP -C2 'import\s+{?\s*Waiter\s*}?\s+from\s+[\'"].+utilities\/waiter[\'"]'
```

Length of output: 68
🏁 Script executed:

```bash
#!/bin/bash
# Find all Waiter imports and their paths with corrected quoting
rg -nP -C2 "import\s+{?\s*Waiter\s*}?\s+from\s+['\"].*utilities\/waiter['\"]"
```

Length of output: 1335
Standardize Waiter import to "src/utilities/waiter"
PoRBFT.ts (line 27) imports from "@/utilities/waiter", while manageConsensusRoutines.ts (line 17) and secretaryManager.ts (line 7) use "src/utilities/waiter". Mixed aliases can load duplicate singletons—update all imports to "src/utilities/waiter".
🤖 Prompt for AI Agents
In src/libs/consensus/v2/PoRBFT.ts around line 27, the Waiter import uses the
alias "@/utilities/waiter" which conflicts with other modules that import from
"src/utilities/waiter" and can create duplicate singletons; update the import to
use the canonical path "src/utilities/waiter" (replace the "@/utilities/waiter"
import) so all modules import Waiter from the same module path.
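To see why mixed aliases can bite, here is a hypothetical two-file sketch: if the module loader does not normalize "@/utilities/waiter" and "src/utilities/waiter" to the same module ID, each path is instantiated separately and static state is split:

```ts
// waiter.ts — module-level ("singleton") static state
export class Waiter {
    static pending = new Map<string, (value: unknown) => void>()
}

// a.ts — registers a waiter through one alias:
//   import { Waiter } from "@/utilities/waiter"
//   Waiter.pending.set("block-42", resolveFn)

// b.ts — looks it up through the other alias:
//   import { Waiter } from "src/utilities/waiter"
//   Waiter.pending.get("block-42") // undefined if the module was duplicated
```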
```ts
    cleanupConsensusState()
    // Joining the temporary mempool to the main one
    // await Mempool.joinTemporaryMempool() // ? Is await ok here?
    manager.endConsensusRoutine()
}
```
Guard and await cleanup; avoid crash when not-in-shard
If initializeShard throws NotInShardError, manager.shard is undefined; calling endConsensusRoutine() will crash. Also, not awaiting can leave dangling state.
```diff
- } finally {
-     cleanupConsensusState()
-     manager.endConsensusRoutine()
- }
+ } finally {
+     cleanupConsensusState()
+     if (manager?.shard) {
+         await manager.endConsensusRoutine()
+     }
+ }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
} finally {
    cleanupConsensusState()
    if (manager?.shard) {
        await manager.endConsensusRoutine()
    }
}
```
🤖 Prompt for AI Agents
In src/libs/consensus/v2/PoRBFT.ts around lines 248 to 250, ensure cleanup and
shutdown are awaited and guarded: await cleanupConsensusState(); then only call
and await manager.endConsensusRoutine() if manager.shard is defined (e.g. if
(manager.shard) await manager.endConsensusRoutine()); this prevents calling
endConsensusRoutine when initializeShard failed with NotInShardError and avoids
leaving dangling async work.
```ts
private _greenlight_timeout = 30_000 // 15 seconds
private _set_validator_phase_timeout = 15_000 // 10 seconds
```
Timeout comments don’t match values
30_000 is 30s (not 15s). 15_000 is 15s (not 10s). Update comments to avoid confusion.
```diff
- private _greenlight_timeout = 30_000 // 15 seconds
- private _set_validator_phase_timeout = 15_000 // 10 seconds
+ private _greenlight_timeout = 30_000 // 30 seconds
+ private _set_validator_phase_timeout = 15_000 // 15 seconds
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
private _greenlight_timeout = 30_000 // 30 seconds
private _set_validator_phase_timeout = 15_000 // 15 seconds
```
🤖 Prompt for AI Agents
In src/libs/consensus/v2/types/secretaryManager.ts around lines 16 to 18, the
inline comments for the timeout constants are incorrect: 30_000 is 30 seconds
(not 15s) and 15_000 is 15 seconds (not 10s). Update the comments to match the
actual values (e.g., change "// 15 seconds" to "// 30 seconds" and "// 10
seconds" to "// 15 seconds") so the constant declarations accurately document
their durations.
```diff
  // ANCHOR Singleton logic
- public static getInstance(): SecretaryManager {
-     if (!SecretaryManager.instance) {
-         SecretaryManager.instance = new SecretaryManager()
+ public static getInstance(
+     blockRef?: number,
+     initialize = false,
+ ): SecretaryManager {
+     // INFO: If blockRef is not provided, use the last block number + 1
+     // ie. assume we're using this instance for latest block
+     if (!blockRef) {
+         blockRef = getSharedState.lastBlockNumber + 1
+     }
+
+     if (!SecretaryManager.instances.get(blockRef)) {
+         if (initialize) {
+             SecretaryManager.instances.set(blockRef, new SecretaryManager())
+         } else {
+             return null
+         }
      }
-     return SecretaryManager.instance
+     return SecretaryManager.instances.get(blockRef)
  }
```
Fix getInstance return type and falsy blockRef handling
Method returns null but is typed as SecretaryManager. Also 0/undefined blockRef collapses; use nullish check.
```diff
- public static getInstance(
-     blockRef?: number,
-     initialize = false,
- ): SecretaryManager {
-     // INFO: If blockRef is not provided, use the last block number + 1
-     // ie. assume we're using this instance for latest block
-     if (!blockRef) {
-         blockRef = getSharedState.lastBlockNumber + 1
-     }
+ public static getInstance(
+     blockRef?: number,
+     initialize = false,
+ ): SecretaryManager | null {
+     // INFO: If blockRef is not provided, use the last block number + 1
+     // ie. assume we're using this instance for latest block
+     if (blockRef == null) {
+         blockRef = getSharedState.lastBlockNumber + 1
+     }
      if (!SecretaryManager.instances.get(blockRef)) {
          if (initialize) {
              SecretaryManager.instances.set(blockRef, new SecretaryManager())
          } else {
              return null
          }
      }
      return SecretaryManager.instances.get(blockRef)
  }
```

🤖 Prompt for AI Agents
src/libs/consensus/v2/types/secretaryManager.ts around lines 958 to 978, the
getInstance method incorrectly types its return as SecretaryManager while
returning null in some paths and treats falsy blockRef (0) as absent; update the
signature to return SecretaryManager | null, accept blockRef?: number | null,
and replace the falsy check with a nullish check (if (blockRef == null) {
blockRef = getSharedState.lastBlockNumber + 1 }) so that 0 is treated as a valid
blockRef, and ensure all callers handle the nullable return.
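With the nullable signature, call sites would branch roughly as follows. This sketch reuses the registry shape sketched near the top of this review and mirrors the "fake 200" handling elsewhere in the PR; it is not the committed code:

```ts
// Hypothetical handler fragment: response shape is an assumption.
function handleForBlock(
    blockRef: number,
    response: { result: number; response: string },
): SecretaryManager | null {
    const manager = SecretaryManager.getInstance(blockRef) // SecretaryManager | null
    if (!manager) {
        // No round tracked for this blockRef: the peer is likely behind,
        // so soft-accept rather than failing the request.
        response.result = 200
        response.response = "Secretary manager not found"
        return null
    }
    // Safe to use manager beyond this point.
    return manager
}
```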
```ts
await fs.promises.writeFile(
    `data/twitter/${userId}.json`,
    JSON.stringify(res.data, null, 2),
)
```
🛠️ Refactor suggestion | 🟠 Major
🧩 Analysis chain
Verify directory existence before writing files.
The async file writes look correct, but there's no check to ensure the data/twitter/ directory exists before writing. If the directory doesn't exist, the write operations will fail.
Run the following script to check if the directory is created elsewhere:
Also applies to: 546-549
🏁 Script executed:

```bash
#!/bin/bash
# Search for directory creation logic for data/twitter/
rg -nP -C3 'data/twitter|fs\.(mkdir|mkdirSync|ensureDir)' --type=ts
```

Length of output: 2824
Ensure the data/twitter directory exists before writing files. Insert before each fs.promises.writeFile (lines 527–530 and 546–549) a call such as:

```ts
const filePath = `data/twitter/${userId}.json`
await fs.promises.mkdir(path.dirname(filePath), { recursive: true })
await fs.promises.writeFile(filePath, JSON.stringify(res.data, null, 2))
```
In src/libs/identity/tools/twitter.ts around lines 527–530 and 546–549, the code
writes JSON files to data/twitter without ensuring the directory exists; before
each fs.promises.writeFile compute the target filePath (e.g.
`data/twitter/${userId}.json`), call fs.promises.mkdir(path.dirname(filePath), {
recursive: true }) to create the directory if missing, then call
fs.promises.writeFile(filePath, JSON.stringify(res.data, null, 2)); also ensure
the file uses/has an import for path (import path from 'path') if not already
present.
```ts
const peer = PeerManager.getInstance().getPeer(sender)
log.debug("Sender: " + peer.connection.string)
log.debug("Payload: " + JSON.stringify(payload, null, 2))
log.debug("-----------------------------")
```
Fix potential null deref when logging peer
getPeer(sender) may return undefined; accessing peer.connection.string will throw.
```diff
- const peer = PeerManager.getInstance().getPeer(sender)
- log.debug("Sender: " + peer.connection.string)
- log.debug("Payload: " + JSON.stringify(payload, null, 2))
+ const peer = PeerManager.getInstance().getPeer(sender)
+ log.debug("Sender: " + (peer?.connection?.string ?? sender))
+ log.debug("Payload: " + JSON.stringify(payload, null, 2))
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
const peer = PeerManager.getInstance().getPeer(sender)
log.debug("Sender: " + (peer?.connection?.string ?? sender))
log.debug("Payload: " + JSON.stringify(payload, null, 2))
log.debug("-----------------------------")
```
🤖 Prompt for AI Agents
In src/libs/network/manageConsensusRoutines.ts around lines 43 to 46, the code
assumes PeerManager.getInstance().getPeer(sender) always returns a peer and
directly accesses peer.connection.string; update the logging to guard against
undefined by checking peer and connection before accessing string (e.g., use an
if-check or optional chaining) and fall back to logging the sender ID or a
descriptive placeholder when peer or connection is missing; keep the payload
JSON.stringify log as-is but ensure it still executes even when peer is
undefined.
```diff
  response.result = 200
  await getCommonValidatorSeed() // NOTE This is generated each time and stored in the shared state
- response.response = getSharedState.currentValidatorSeed
+ response.response = commonValidatorSeed
  break
```
Return the freshly computed commonValidatorSeed (not a stale outer variable)
Currently you compute a new seed but return the earlier one captured outside the switch.
```diff
- case "getCommonValidatorSeed":
-     response.result = 200
-     await getCommonValidatorSeed() // NOTE This is generated each time and stored in the shared state
-     response.response = commonValidatorSeed
+ case "getCommonValidatorSeed":
+     response.result = 200
+     {
+         const { commonValidatorSeed } = await getCommonValidatorSeed()
+         response.response = commonValidatorSeed
+     }
      break
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
case "getCommonValidatorSeed":
    response.result = 200
    {
        const { commonValidatorSeed } = await getCommonValidatorSeed()
        response.response = commonValidatorSeed
    }
    break
```
🤖 Prompt for AI Agents
In src/libs/network/manageConsensusRoutines.ts around lines 223 to 226, the code
calls await getCommonValidatorSeed() but then sets response.response to a stale
outer-scope commonValidatorSeed; instead, capture the freshly computed seed by
assigning const commonValidatorSeed = await getCommonValidatorSeed() (or a let
if mutation needed) and set response.response = commonValidatorSeed so the
returned value is the newly generated seed.
```diff
  const [blockRef, timestamp, validatorPhase] = payload.params as [
      number, // blockRef
      number, // timestamp
      number, // validatorPhase
  ]
  log.info(
      "payload.params: " + JSON.stringify(payload.params, null, 2),
  )
- const manager = SecretaryManager.getInstance()
-
- log.debug("Our secretary identity: " + manager.secretary.identity)
- log.debug("shard: " + manager.shard.members.map(m => m.identity))
+ const manager = SecretaryManager.getInstance(blockRef)
+
+ // INFO: If the manager class for that block is not found, assume peer is behind on the consensus
+ // return a 200 to unblock peer
+ if (!manager) {
+     log.debug("returning a fake 200")
+     response.result = 200
+     response.response = "Secretary manager not found"
+     return response
+ }
```
Await greenlight; current code always 200 due to truthy Promise
receiveGreenLight is async; without await, response.result is always 200.
```diff
- const greenLightReceived = manager.receiveGreenLight(
-     timestamp,
-     validatorPhase,
- )
- response.result = greenLightReceived ? 200 : 400
- response.response = greenLightReceived
+ const ok = await manager.receiveGreenLight(timestamp, validatorPhase)
+ response.result = ok ? 200 : 400
+ response.response = ok
      ? `Greenlight for phase: ${validatorPhase} received with block timestamp: ${timestamp}`
      : "Error receiving greenlight"
```

Also applies to: 374-383
🤖 Prompt for AI Agents
In src/libs/network/manageConsensusRoutines.ts around lines 347-363 (and
similarly 374-383), the code calls the async receiveGreenLight without awaiting
it so the Promise is treated as truthy and the branch always sets
response.result = 200; fix by awaiting the async call and handling its resolved
value properly: change calls like const manager =
SecretaryManager.getInstance(blockRef) and manager.receiveGreenLight(...) to
await the promise (e.g., const result = await manager.receiveGreenLight(...)) or
await manager initialization as needed, then check the actual boolean/response
to decide whether to set response.result = 200 or propagate the real result;
also add try/catch around awaits to log and return appropriate error responses.
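The underlying pitfall in isolation: an async function returns a Promise object, which is always truthy, even when it will resolve to false:

```ts
async function receiveGreenLightLike(): Promise<boolean> {
    return false
}

async function main(): Promise<void> {
    const missingAwait = receiveGreenLightLike() // Promise<boolean>
    console.log(missingAwait ? 200 : 400) // 200 — the Promise itself is truthy

    const awaited = await receiveGreenLightLike()
    console.log(awaited ? 200 : 400) // 400 — the actual resolved value
}

main()
```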
```diff
  fs.rmSync(basicPeerScriptFile)
- fs.writeFileSync(basicPeerScriptFile, basicPeerScript)
+ await fs.promises.writeFile(basicPeerScriptFile, basicPeerScript)
```
Add error handling for file removal.
Line 14's fs.rmSync will throw an error if the file doesn't exist. Since this is a bootstrap function (comment on line 3 indicates it's used when generating new identity with no valid peers), the file may not exist on first run.
Apply this diff to handle the case gracefully:
```diff
- fs.rmSync(basicPeerScriptFile)
+ try {
+     fs.rmSync(basicPeerScriptFile)
+ } catch (error) {
+     // File doesn't exist, which is fine
+     if (!(error instanceof Error && error.message.includes("ENOENT"))) {
+         throw error
+     }
+ }
  await fs.promises.writeFile(basicPeerScriptFile, basicPeerScript)
```

Alternatively, use `fs.rmSync(basicPeerScriptFile, { force: true })` to suppress errors if the file doesn't exist.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```ts
try {
    fs.rmSync(basicPeerScriptFile)
} catch (error) {
    // File doesn't exist, which is fine
    if (!(error instanceof Error && error.message.includes("ENOENT"))) {
        throw error
    }
}
await fs.promises.writeFile(basicPeerScriptFile, basicPeerScript)
```
🤖 Prompt for AI Agents
In src/utilities/selfPeer.ts around lines 14 to 15, the call to
fs.rmSync(basicPeerScriptFile) will throw if the file does not exist; update the
removal to handle missing files gracefully by using
fs.rmSync(basicPeerScriptFile, { force: true }) or wrap the removal in a
try/catch that ignores ENOENT and rethrows other errors, then proceed to await
fs.promises.writeFile(basicPeerScriptFile, basicPeerScript).
PR Type
Bug fix, Enhancement
Description
Fix consensus routine synchronization on 4+ nodes
Replace synchronous file operations with async promises
Optimize nested loops in consensus routines
Implement per-block SecretaryManager instances
Diagram Walkthrough
File Walkthrough
18 files
- Replace writeFileSync with async promises
- Convert operations registry to async
- Update mempool filtering and consensus checks
- Add debug logging for block downloads
- Improve consensus time logging format
- Optimize signature verification with promises
- Replace writeFileSync with async promises
- Convert file operations to async
- Improve error logging with connection string
- Optimize peerlist merging with batching
- Replace sync file operations with async
- Convert writeFileSync to async promises
- Make wallet operations async
- Convert custom logging to async
- Add event loop yielding and configurable sleep
- Replace writeFileSync with async promises
- Add mainLoopSleepTime and remove lastShard
- Update waiter to support pre-held values

8 files
- Pass blockRef to SecretaryManager instances
- Comment out shared state updates
- Remove shared state shard tracking
- Recalculate shard for validation checks
- Implement per-block instance management system
- Add extensive debugging and shard validation
- Remove duplicate GCRTracker import
- Add await to wallet dispatch call

2 files
- Add missing import formatting
- Fix variable declaration type

Summary by CodeRabbit
New Features
Performance
Bug Fixes
Logging