Optimize `scheduleBatchSweepTask` function #850

louisinger · 2025-12-02T20:18:04Z

This PR optimizes the `scheduleBatchSweepTask` function. Instead of depending on the VtxoTree struct, it relies on the vtxo tree root transaction ID. Once the sweep is triggered, we fetch and deserialize the whole vtxo tree.

As a result, the scheduler no longer needs to keep all the sweepable vtxo tree structs in memory. It also saves a lot of PSBT decoding while scheduling sweeps.
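
To illustrate the idea, here is a minimal sketch, not the actual arkd code. The `subtree`, `roundStore`, and `sweepTask` types are hypothetical stand-ins: the pending task stores two short strings, and the tree is looked up only when the task runs.

```go
package main

import "fmt"

// subtree is a hypothetical stand-in for a decoded vtxo (sub)tree.
type subtree struct {
	rootTxid string
	leaves   []string
}

// roundStore is a hypothetical persistence layer:
// commitment txid -> root txid -> subtree.
type roundStore map[string]map[string]subtree

// sweepTask holds only two identifiers while it waits in the scheduler.
type sweepTask struct {
	commitmentTxid string
	rootTxid       string
}

// run resolves the subtree at execution time and returns its leaves.
func (t sweepTask) run(store roundStore) ([]string, error) {
	round, ok := store[t.commitmentTxid]
	if !ok {
		return nil, fmt.Errorf("round %s not found", t.commitmentTxid)
	}
	tree, ok := round[t.rootTxid]
	if !ok {
		return nil, fmt.Errorf("vtxo tree %s not found in round %s", t.rootTxid, t.commitmentTxid)
	}
	return tree.leaves, nil
}

func main() {
	store := roundStore{
		"commitment1": {"root1": {rootTxid: "root1", leaves: []string{"leafA", "leafB"}}},
	}
	task := sweepTask{commitmentTxid: "commitment1", rootTxid: "root1"}
	leaves, err := task.run(store)
	fmt.Println(leaves, err)
}
```

The trade-off is that a missing or malformed tree now surfaces when the task executes rather than when it is scheduled, so the execution path needs a clear error.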

related to #823

@sekulicd @altafan please review

Summary by CodeRabbit

New Features
- Exposed a method to obtain a transaction subtree's root transaction ID directly.
Refactor
- Sweep scheduling and related task creation now use root transaction IDs instead of full tree objects, simplifying execution flow and reducing runtime bookkeeping.
- Expiration updates now target subtrees by root ID.
Tests
- Added serialization/roundtrip tests validating leaf integrity and correct root identification.

coderabbitai · 2025-12-02T20:18:14Z

Walkthrough

Switched sweeper scheduling to pass vtxo subtree root TXID strings instead of full TxTree objects. Sweeper tasks lazily resolve subtrees at execution by loading round data. Added FlatTxTree.RootTxid() and extended serialization tests to assert root/leaf consistency.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Sweep scheduling refactor `internal/core/application/service.go`, `internal/core/application/sweeper.go` | Replaced `*tree.TxTree` parameters with `vtxoTreeRootTxid string`. Updated `scheduleBatchSweep`, `createBatchSweepTask`, and `updateVtxoExpirationTime` signatures and call sites. Tasks now accept a root TXID, load round data at runtime, locate the subtree by root TXID, and schedule/sweep subtrees by root TXID. Removed in-memory TxTree captures and related nil checks. |
| Tree root ID helper `pkg/ark-lib/tree/tx_tree.go` | Added `func (c FlatTxTree) RootTxid() string` to compute and return the root transaction ID for a flat tree. |
| Tree serialization tests `pkg/ark-lib/tree/tx_tree_test.go` | Added assertions after serialization roundtrip to verify leaves map to deserialized leaf nodes (matching TxIDs, no children) and that the flat tree root TXID matches the deserialized root's unsigned TxID. |

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
actor Scheduler
participant Service
participant Sweeper
participant RoundStore
Scheduler->>Service: schedule sweep (uses FlatTxTree.RootTxid())
Service->>Sweeper: scheduleBatchSweep(expiration, commitmentTxid, vtxoTreeRootTxid)
Sweeper->>RoundStore: loadRound(commitmentTxid)
RoundStore-->>Sweeper: round data (flat vtxo tree)
Sweeper->>Sweeper: locate subtree by vtxoTreeRootTxid
alt subtree found
Sweeper->>Sweeper: create and execute sweep task for subtree
Sweeper->>RoundStore: updateVtxoExpirationTime(commitmentTxid, vtxoTreeRootTxid, exp)
RoundStore-->>Sweeper: ack
else not found
Sweeper-->>Service: return error / log failure
end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

- Inspect createBatchSweepTask for correct lazy loading and subtree lookup logic.
- Confirm all call sites updated to new signatures; search for lingering *tree.TxTree captures.
- Review RootTxid() for edge cases (single-node tree, malformed or cyclic input).
- Verify error and logging paths when subtree lookup fails.

Possibly related issues

Reduce memory retention for scheduled sweep tasks #823 — Aligns with replacing captured TxTree pointers with root TXID strings to avoid retaining full VTXO trees in scheduled tasks.

Possibly related PRs

Optimize sweeper #816 — Substantially similar change: switching sweeper and scheduling to operate on vtxo subtree root TXIDs.
Fix sweeper scheduling after arkd restart #809 — Related adjustments around TxTree lookup utilities and sweeper scheduling logic.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Title check | ⚠️ Warning | The title references `scheduleBatchSweepTask` but the main change is refactoring scheduleBatchSweep and related functions to use root TXID instead of VtxoTree objects; the title doesn't capture this core optimization. | Consider a more accurate title like 'Refactor sweep scheduling to use root TXID instead of VtxoTree objects' to better reflect the primary optimization. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

pkg/ark-lib/tree/tx_tree.go (1)
38-58: Consider handling edge cases for malformed trees.

The RootTxid() method assumes a well-formed tree with exactly one root. If multiple roots exist or no root is found, the current implementation returns the first non-child node or an empty string without signaling an error. Consider:

- Logging a warning when no root is found (line 57) instead of silently returning an empty string
- Validating that exactly one root exists, similar to how NewTxTree() validates at lines 113-119

 func (c FlatTxTree) RootTxid() string {
 	if len(c) == 1 {
 		return c[0].Txid
 	}
 
 	// the root is the node not being a child of another one
 	allchildren := make(map[string]struct{})
 	for _, node := range c {
 		for _, child := range node.Children {
 			allchildren[child] = struct{}{}
 		}
 	}
 
 	for _, node := range c {
 		if _, ok := allchildren[node.Txid]; !ok {
 			return node.Txid
 		}
 	}
 
-	return ""
+	// No root found - this indicates a malformed tree
+	return ""
 }
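
As a rough sketch of the second suggestion, and not the PR's code, a root lookup could return an error unless exactly one root exists. The `node` type below is a simplified stand-in, the shape of its `Children` field is an assumption, and `rootTxid` is a hypothetical free function.

```go
package main

import (
	"errors"
	"fmt"
)

// node is a simplified, hypothetical stand-in for one entry of a flat tx tree.
type node struct {
	Txid     string
	Children map[uint32]string // output index -> child txid (assumed shape)
}

var errNoRoot = errors.New("no root found: empty or cyclic tree")

// rootTxid returns the txid of the only node that is not a child of another
// node, and an error when there is no such node or more than one.
func rootTxid(nodes []node) (string, error) {
	children := make(map[string]struct{})
	for _, n := range nodes {
		for _, child := range n.Children {
			children[child] = struct{}{}
		}
	}

	root, found := "", 0
	for _, n := range nodes {
		if _, isChild := children[n.Txid]; !isChild {
			if found == 0 {
				root = n.Txid
			}
			found++
		}
	}

	switch found {
	case 0:
		return "", errNoRoot
	case 1:
		return root, nil
	default:
		return "", fmt.Errorf("expected exactly one root, found %d", found)
	}
}

func main() {
	flat := []node{
		{Txid: "root", Children: map[uint32]string{0: "a", 1: "b"}},
		{Txid: "a"},
		{Txid: "b"},
	}
	txid, err := rootTxid(flat)
	fmt.Println(txid, err)
}
```

Returning an error changes the method signature, so every caller would need updating; logging a warning and keeping the `string` return is the less invasive option.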

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ad67c6 and 3b0223d.

📒 Files selected for processing (4)

internal/core/application/service.go (1 hunks)
internal/core/application/sweeper.go (6 hunks)
pkg/ark-lib/tree/tx_tree.go (1 hunks)
pkg/ark-lib/tree/tx_tree_test.go (1 hunks)

🧰 Additional context used

🧠 Learnings (1)

📚 Learning: 2025-08-28T08:21:01.170Z

Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.

Applied to files:

internal/core/application/service.go
internal/core/application/sweeper.go

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Build and Scan
GitHub Check: unit tests
GitHub Check: integration tests

🔇 Additional comments (6)

pkg/ark-lib/tree/tx_tree_test.go (1)

73-83: LGTM!

The new test coverage for RootTxid() validates that the root TXID extracted from the flat tree matches the deserialized tree's root transaction ID. The test also verifies leaf consistency, ensuring the serialization roundtrip works correctly.

internal/core/application/service.go (1)

3449-3449: LGTM!

The change to pass round.VtxoTree.RootTxid() instead of the full tree object aligns with the PR objective to reduce memory usage in the scheduler. The full vtxo tree will now be lazily resolved during sweep execution.

internal/core/application/sweeper.go (4)

99-99: LGTM!

The change to pass flatVtxoTree.RootTxid() instead of the full tree object is consistent with the lazy tree resolution approach. The full tree will be reconstructed when the task executes.

327-349: LGTM!

The refactored scheduleBatchSweep signature correctly accepts the root transaction ID instead of the full tree. The task ID now uses the root TXID, which is appropriate since each subtree's root uniquely identifies the sweep task.

715-748: LGTM!

The refactored updateVtxoExpirationTime correctly implements the lazy tree resolution pattern. It loads the round's vtxo tree, locates the subtree by root TXID, and updates expiration times on the subtree's leaves. The error handling is appropriate.

398-424: Verify error handling for tree resolution.

The new tree resolution logic correctly defers loading and deserialization to task execution time. However, verify the following:

- If tree.NewTxTree() fails at line 412, confirm that the scheduler logs the error appropriately when the task fails.
- If the vtxoTree is not found at line 418, consider adding debug logging with context about which trees were available in the round to aid troubleshooting.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (2)

internal/core/application/sweeper.go (2)
326-350: Consider validating vtxoTreeRootTxid is non-empty.

The function accepts a string parameter but doesn't validate it before using it as the task ID. An empty string would result in a task with an empty ID, which could cause issues with task deduplication logic in scheduleTask.
 func (s *sweeper) scheduleBatchSweep(
 	expirationTimestamp int64, commitmentTxid, vtxoTreeRootTxid string,
 ) error {
+	if vtxoTreeRootTxid == "" {
+		return fmt.Errorf("vtxoTreeRootTxid cannot be empty")
+	}
 	if err := s.scheduleTask(sweeperTask{
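
To show why an empty ID matters, here is a self-contained sketch of a scheduler that deduplicates by task ID. The `task` and `scheduler` types are hypothetical, and the deduplication behaviour is an assumption about `scheduleTask`, not a copy of it.

```go
package main

import (
	"errors"
	"fmt"
)

// task is a hypothetical, minimal stand-in for a scheduled sweep task.
type task struct {
	id string // the vtxo tree root txid in this PR
	at int64  // execution timestamp
}

// scheduler deduplicates tasks by id (assumed behaviour, for illustration).
type scheduler struct {
	tasks map[string]task
}

func newScheduler() *scheduler {
	return &scheduler{tasks: make(map[string]task)}
}

// schedule rejects empty ids up front, then skips ids already scheduled.
// It reports whether the task was newly added.
func (s *scheduler) schedule(t task) (bool, error) {
	if t.id == "" {
		return false, errors.New("task id cannot be empty")
	}
	if _, exists := s.tasks[t.id]; exists {
		return false, nil
	}
	s.tasks[t.id] = t
	return true, nil
}

func main() {
	s := newScheduler()
	fmt.Println(s.schedule(task{id: "root1", at: 100})) // newly added
	fmt.Println(s.schedule(task{id: "root1", at: 100})) // duplicate, skipped
	fmt.Println(s.schedule(task{at: 200}))              // empty id, rejected
}
```

Without the guard, every task with an empty root TXID would share the same key, so only the first one would ever be scheduled.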
717-750: Consider extracting common tree loading logic.

This function duplicates the tree loading pattern from createBatchSweepTask (lines 406-421):

- Fetch the flat tree from the repository
- Construct the TxTree with tree.NewTxTree()
- Find the subtree with Find(vtxoTreeRootTxid)

A helper could reduce duplication:
func (s *sweeper) loadVtxoSubtree(ctx context.Context, commitmentTxid, vtxoTreeRootTxid string) (*tree.TxTree, error) {
    flatTree, err := s.repoManager.Rounds().GetRoundVtxoTree(ctx, commitmentTxid)
    if err != nil {
        return nil, err
    }
    roundTree, err := tree.NewTxTree(flatTree)
    if err != nil {
        return nil, err
    }
    subTree := roundTree.Find(vtxoTreeRootTxid)
    if subTree == nil {
        return nil, fmt.Errorf("vtxo tree %s not found in round %s", vtxoTreeRootTxid, commitmentTxid)
    }
    return subTree, nil
}

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b0223d and 919a3fb.

📒 Files selected for processing (1)

internal/core/application/sweeper.go (9 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

internal/core/application/sweeper.go (1)

pkg/ark-lib/tree/tx_tree.go (1)

NewTxTree (71-135)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: integration tests
GitHub Check: unit tests

🔇 Additional comments (3)

internal/core/application/sweeper.go (3)

98-102: LGTM - Correctly implements lazy loading pattern.

The change to pass flatVtxoTree.RootTxid() instead of constructing the full TxTree achieves the memory optimization goal. The tree will be reconstructed at execution time in createBatchSweepTask.

400-421: LGTM - Lazy loading correctly implemented.

The tree is now reconstructed at execution time from round data, achieving the memory optimization goal. The Find() method correctly locates the subtree by root TXID.

Note that if multiple subtrees from the same round are scheduled, the round data and tree reconstruction will occur for each subtree independently. This is an acceptable trade-off given the PR objective to reduce memory during scheduling, but could be optimized further if execution-time performance becomes a concern.
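
If that execution-time cost ever matters, one possible direction is a small cache keyed by commitment TXID. This is only a sketch: `roundTreeCache` and its loader function are hypothetical names, and nothing like it exists in the PR.

```go
package main

import (
	"fmt"
	"sync"
)

// roundTreeCache memoizes one decoded value per commitment txid.
// T would be the decoded round tree type; the cache is hypothetical.
type roundTreeCache[T any] struct {
	mu      sync.Mutex
	load    func(commitmentTxid string) (T, error)
	entries map[string]T
}

func newRoundTreeCache[T any](load func(string) (T, error)) *roundTreeCache[T] {
	return &roundTreeCache[T]{load: load, entries: make(map[string]T)}
}

// get returns the cached value, calling the loader only on a miss.
func (c *roundTreeCache[T]) get(commitmentTxid string) (T, error) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if v, ok := c.entries[commitmentTxid]; ok {
		return v, nil
	}
	v, err := c.load(commitmentTxid)
	if err != nil {
		var zero T
		return zero, err
	}
	c.entries[commitmentTxid] = v
	return v, nil
}

// evict drops an entry so the decoded tree can be garbage collected.
func (c *roundTreeCache[T]) evict(commitmentTxid string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	delete(c.entries, commitmentTxid)
}

func main() {
	loads := 0
	cache := newRoundTreeCache(func(txid string) (string, error) {
		loads++
		return "decoded tree for " + txid, nil
	})

	cache.get("commitment1")
	cache.get("commitment1")
	fmt.Println("loader calls:", loads)
	cache.evict("commitment1")
}
```

Any such cache has to evict entries once a round's sweeps finish; otherwise it reintroduces the memory retention this PR removes.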

449-460: LGTM - Subtree scheduling correctly uses root TXID pattern.

The nested sweep scheduling consistently uses subTreeRootTxid extracted from the subtree, maintaining the new API contract throughout the recursive scheduling process.

internal/core/application/sweeper.go

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

internal/core/application/sweeper.go (2)

88-103: Startup restore now keyed by vtxo root txid – behavior looks preserved

Switching the startup restore path to createBatchSweepTask(txid, flatVtxoTree.RootTxid()) preserves the previous behavior while removing the need to hold a *tree.TxTree in memory for each sweepable round, which matches the PR goal. Assuming FlatTxTree.RootTxid() is implemented without fully decoding all PSBTs (or is at least cheaper than NewTxTree), this looks like a good trade‑off; if it still walks/decodes the whole tree, you might consider memoizing or storing the root txid alongside the flat representation to fully realize the scheduling perf win.

325-349: Root‑txid–based task IDs and expiration updates are coherent; consider de‑duplicating tree re‑hydration

Using vtxoTreeRootTxid as the sweeperTask.id and as the lookup key in updateVtxoExpirationTime keeps scheduling and expiration updates aligned on the same identifier, and the logging with the "root" field makes it clear which subtree is being handled. One thing to watch is that updateVtxoExpirationTime now re‑hydrates the full round tree (GetRoundVtxoTree + tree.NewTxTree + Find) on every scheduling call, while createBatchSweepTask also re‑hydrates the tree at execution time; if round trees are large or sweeps frequent, you might want to factor out a small helper (e.g., “find subtree by (commitmentTxid, rootTxid)”) or pass an already‑decoded tree where available to avoid double work on hot paths. The use of context.Background() here is consistent with our existing pattern for long‑lived sweep flows that should outlive request contexts. Based on learnings, this design choice makes sense; the refactor is purely optional.

Also applies to: 717-749

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 919a3fb and 93aad50.

📒 Files selected for processing (1)

internal/core/application/sweeper.go (9 hunks)

🧰 Additional context used

🧬 Code graph analysis (1)

internal/core/application/sweeper.go (1)

pkg/ark-lib/tree/tx_tree.go (1)

NewTxTree (71-135)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: unit tests
GitHub Check: integration tests

🔇 Additional comments (1)

internal/core/application/sweeper.go (1)

396-563: Lazy subtree resolution in createBatchSweepTask and updated comments look good

Re‑hydrating the round vtxo tree inside createBatchSweepTask and then locating the subtree via roundVtxoTree.Find(vtxoTreeRootTxid) cleanly implements the new root‑txid–based scheduling model, and the explicit error when the subtree isn’t found should make data issues evident in logs. The subsequent logic—deriving subtrees with computeSubTrees, scheduling follow‑up sweeps using their root txids, and the clarified comments around “add the input anyway” and assuming a batch output is spent when the first vtxo is swept/unrolled—preserves the previous behavior while making intent clearer. The extra pass that filters out already‑spent outputs before building the sweep transaction is also well documented by the new comments; no functional issues stand out in this block.

batch sweep task: use vtxo tree root txid

3b0223d

louisinger requested review from altafan and sekulicd December 2, 2025 20:18

coderabbitai bot reviewed Dec 2, 2025

Lint

919a3fb

coderabbitai bot reviewed Dec 3, 2025

Typo

93aad50

coderabbitai bot reviewed Dec 3, 2025

altafan merged commit c28ac48 into arkade-os:master Dec 3, 2025
5 checks passed

coderabbitai bot mentioned this pull request Dec 8, 2025

Wait for confirmation before scheduling sweep task #838

Merged
