Skip to content

Conversation

@louisinger
Copy link
Collaborator

@louisinger louisinger commented Dec 2, 2025

This PR aims to optimize the scheduleBatchSweepTask function. Instead of depending on the VtxoTree struct, it relies on the vtxo tree root transaction id. Then, once the sweep is triggered, we deserialize and fetch the whole vtxo tree.

Thus, the scheduler does not need to keep in memory all the sweepable vtxo tree structs. It also saves bunch of PSBTs decoding while scheduling sweep.

related to #823

@sekulicd @altafan please review

Summary by CodeRabbit

  • New Features

    • Exposed a method to obtain a transaction subtree's root transaction ID directly.
  • Refactor

    • Sweep scheduling and related task creation now use root transaction IDs instead of full tree objects, simplifying execution flow and reducing runtime bookkeeping.
    • Expiration updates now target subtrees by root ID.
  • Tests

    • Added serialization/roundtrip tests validating leaf integrity and correct root identification.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 2, 2025

Walkthrough

Switched sweeper scheduling to pass vtxo subtree root TXID strings instead of full TxTree objects. Sweeper tasks lazily resolve subtrees at execution by loading round data. Added FlatTxTree.RootTxid() and extended serialization tests to assert root/leaf consistency.

Changes

Cohort / File(s) Summary
Sweep scheduling refactor
internal/core/application/service.go, internal/core/application/sweeper.go
Replaced *tree.TxTree parameters with vtxoTreeRootTxid string. Updated scheduleBatchSweep, createBatchSweepTask, and updateVtxoExpirationTime signatures and call sites. Tasks now accept a root TXID, load round data at runtime, locate the subtree by root TXID, and schedule/sweep subtrees by root TXID. Removed in-memory TxTree captures and related nil checks.
Tree root ID helper
pkg/ark-lib/tree/tx_tree.go
Added func (c FlatTxTree) RootTxid() string to compute and return the root transaction ID for a flat tree.
Tree serialization tests
pkg/ark-lib/tree/tx_tree_test.go
Added assertions after serialization roundtrip to verify leaves map to deserialized leaf nodes (matching TxIDs, no children) and that the flat tree root TXID matches the deserialized root's unsigned TxID.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
actor Scheduler
participant Service
participant Sweeper
participant RoundStore
Scheduler->>Service: schedule sweep (uses FlatTxTree.RootTxid())
Service->>Sweeper: scheduleBatchSweep(expiration, commitmentTxid, vtxoTreeRootTxid)
Sweeper->>RoundStore: loadRound(commitmentTxid)
RoundStore-->>Sweeper: round data (flat vtxo tree)
Sweeper->>Sweeper: locate subtree by vtxoTreeRootTxid
alt subtree found
Sweeper->>Sweeper: create and execute sweep task for subtree
Sweeper->>RoundStore: updateVtxoExpirationTime(commitmentTxid, vtxoTreeRootTxid, exp)
RoundStore-->>Sweeper: ack
else not found
Sweeper-->>Service: return error / log failure
end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Inspect createBatchSweepTask for correct lazy loading and subtree lookup logic.
  • Confirm all call sites updated to new signatures; search for lingering *tree.TxTree captures.
  • Review RootTxid() for edge cases (single-node tree, malformed or cyclic input).
  • Verify error and logging paths when subtree lookup fails.

Possibly related issues

Possibly related PRs

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title check ⚠️ Warning The title references scheduleBatchSweepTask but the main change is refactoring scheduleBatchSweep and related functions to use root TXID instead of VtxoTree objects; the title doesn't capture this core optimization. Consider a more accurate title like 'Refactor sweep scheduling to use root TXID instead of VtxoTree objects' to better reflect the primary optimization.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
pkg/ark-lib/tree/tx_tree.go (1)

38-58: Consider handling edge cases for malformed trees.

The RootTxid() method assumes a well-formed tree with exactly one root. If multiple roots exist or no root is found, the current implementation returns the first non-child node or an empty string without signaling an error. Consider:

  1. Logging a warning when no root is found (line 57) instead of silently returning an empty string
  2. Validating that exactly one root exists, similar to how NewTxTree() validates at lines 113-119
 func (c FlatTxTree) RootTxid() string {
 	if len(c) == 1 {
 		return c[0].Txid
 	}
 
 	// the root is the node not being a child of another one
 	allchildren := make(map[string]struct{})
 	for _, node := range c {
 		for _, child := range node.Children {
 			allchildren[child] = struct{}{}
 		}
 	}
 
 	for _, node := range c {
 		if _, ok := allchildren[node.Txid]; !ok {
 			return node.Txid
 		}
 	}
 
-	return ""
+	// No root found - this indicates a malformed tree
+	return ""
 }
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7ad67c6 and 3b0223d.

📒 Files selected for processing (4)
  • internal/core/application/service.go (1 hunks)
  • internal/core/application/sweeper.go (6 hunks)
  • pkg/ark-lib/tree/tx_tree.go (1 hunks)
  • pkg/ark-lib/tree/tx_tree_test.go (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T08:21:01.170Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.

Applied to files:

  • internal/core/application/service.go
  • internal/core/application/sweeper.go
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Build and Scan
  • GitHub Check: unit tests
  • GitHub Check: integration tests
🔇 Additional comments (6)
pkg/ark-lib/tree/tx_tree_test.go (1)

73-83: LGTM!

The new test coverage for RootTxid() validates that the root TXID extracted from the flat tree matches the deserialized tree's root transaction ID. The test also verifies leaf consistency, ensuring the serialization roundtrip works correctly.

internal/core/application/service.go (1)

3449-3449: LGTM!

The change to pass round.VtxoTree.RootTxid() instead of the full tree object aligns with the PR objective to reduce memory usage in the scheduler. The full vtxo tree will now be lazily resolved during sweep execution.

internal/core/application/sweeper.go (4)

99-99: LGTM!

The change to pass flatVtxoTree.RootTxid() instead of the full tree object is consistent with the lazy tree resolution approach. The full tree will be reconstructed when the task executes.


327-349: LGTM!

The refactored scheduleBatchSweep signature correctly accepts the root transaction ID instead of the full tree. The task ID now uses the root TXID, which is appropriate since each subtree's root uniquely identifies the sweep task.


715-748: LGTM!

The refactored updateVtxoExpirationTime correctly implements the lazy tree resolution pattern. It loads the round's vtxo tree, locates the subtree by root TXID, and updates expiration times on the subtree's leaves. The error handling is appropriate.


398-424: Verify error handling for tree resolution.

The new tree resolution logic correctly defers loading and deserialization to task execution time. However, verify the following:

  1. If tree.NewTxTree() fails at line 412, confirm that the scheduler logs the error appropriately when the task fails.

  2. If the vtxoTree is not found at line 418, consider adding debug logging with context about which trees were available in the round to aid troubleshooting.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
internal/core/application/sweeper.go (2)

326-350: Consider validating vtxoTreeRootTxid is non-empty.

The function accepts a string parameter but doesn't validate it before using it as the task ID. An empty string would result in a task with an empty ID, which could cause issues with task deduplication logic in scheduleTask.

 func (s *sweeper) scheduleBatchSweep(
 	expirationTimestamp int64, commitmentTxid, vtxoTreeRootTxid string,
 ) error {
+	if vtxoTreeRootTxid == "" {
+		return fmt.Errorf("vtxoTreeRootTxid cannot be empty")
+	}
 	if err := s.scheduleTask(sweeperTask{

717-750: Consider extracting common tree loading logic.

This function duplicates the tree loading pattern from createBatchSweepTask (lines 406-421):

  1. Fetch flat tree from repository
  2. Construct TxTree with tree.NewTxTree()
  3. Find subtree with Find(vtxoTreeRootTxid)

A helper could reduce duplication:

func (s *sweeper) loadVtxoSubtree(ctx context.Context, commitmentTxid, vtxoTreeRootTxid string) (*tree.TxTree, error) {
    flatTree, err := s.repoManager.Rounds().GetRoundVtxoTree(ctx, commitmentTxid)
    if err != nil {
        return nil, err
    }
    roundTree, err := tree.NewTxTree(flatTree)
    if err != nil {
        return nil, err
    }
    subTree := roundTree.Find(vtxoTreeRootTxid)
    if subTree == nil {
        return nil, fmt.Errorf("vtxo tree %s not found in round %s", vtxoTreeRootTxid, commitmentTxid)
    }
    return subTree, nil
}
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b0223d and 919a3fb.

📒 Files selected for processing (1)
  • internal/core/application/sweeper.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T08:21:01.170Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.

Applied to files:

  • internal/core/application/sweeper.go
🧬 Code graph analysis (1)
internal/core/application/sweeper.go (1)
pkg/ark-lib/tree/tx_tree.go (1)
  • NewTxTree (71-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: integration tests
  • GitHub Check: unit tests
🔇 Additional comments (3)
internal/core/application/sweeper.go (3)

98-102: LGTM - Correctly implements lazy loading pattern.

The change to pass flatVtxoTree.RootTxid() instead of constructing the full TxTree achieves the memory optimization goal. The tree will be reconstructed at execution time in createBatchSweepTask.


400-421: LGTM - Lazy loading correctly implemented.

The tree is now reconstructed at execution time from round data, achieving the memory optimization goal. The Find() method correctly locates the subtree by root TXID.

Note that if multiple subtrees from the same round are scheduled, the round data and tree reconstruction will occur for each subtree independently. This is an acceptable trade-off given the PR objective to reduce memory during scheduling, but could be optimized further if execution-time performance becomes a concern.


449-460: LGTM - Subtree scheduling correctly uses root TXID pattern.

The nested sweep scheduling consistently uses subTreeRootTxid extracted from the subtree, maintaining the new API contract throughout the recursive scheduling process.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
internal/core/application/sweeper.go (2)

88-103: Startup restore now keyed by vtxo root txid – behavior looks preserved

Switching the startup restore path to createBatchSweepTask(txid, flatVtxoTree.RootTxid()) preserves the previous behavior while removing the need to hold a *tree.TxTree in memory for each sweepable round, which matches the PR goal. Assuming FlatTxTree.RootTxid() is implemented without fully decoding all PSBTs (or is at least cheaper than NewTxTree), this looks like a good trade‑off; if it still walks/decodes the whole tree, you might consider memoizing or storing the root txid alongside the flat representation to fully realize the scheduling perf win.


325-349: Root‑txid–based task IDs and expiration updates are coherent; consider de‑duplicating tree re‑hydration

Using vtxoTreeRootTxid as the sweeperTask.id and as the lookup key in updateVtxoExpirationTime keeps scheduling and expiration updates aligned on the same identifier, and the logging with the "root" field makes it clear which subtree is being handled. One thing to watch is that updateVtxoExpirationTime now re‑hydrates the full round tree (GetRoundVtxoTree + tree.NewTxTree + Find) on every scheduling call, while createBatchSweepTask also re‑hydrates the tree at execution time; if round trees are large or sweeps frequent, you might want to factor out a small helper (e.g., “find subtree by (commitmentTxid, rootTxid)”) or pass an already‑decoded tree where available to avoid double work on hot paths. The use of context.Background() here is consistent with our existing pattern for long‑lived sweep flows that should outlive request contexts. Based on learnings, this design choice makes sense; the refactor is purely optional.

Also applies to: 717-749

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 919a3fb and 93aad50.

📒 Files selected for processing (1)
  • internal/core/application/sweeper.go (9 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-28T08:21:01.170Z
Learnt from: louisinger
Repo: arkade-os/arkd PR: 686
File: internal/core/application/fraud.go:47-61
Timestamp: 2025-08-28T08:21:01.170Z
Learning: In reactToFraud function in internal/core/application/fraud.go, the goroutine that waits for confirmation and schedules checkpoint sweep should use context.Background() instead of the request context, as this is intentional design to decouple the checkpoint sweep scheduling from the request lifetime.

Applied to files:

  • internal/core/application/sweeper.go
🧬 Code graph analysis (1)
internal/core/application/sweeper.go (1)
pkg/ark-lib/tree/tx_tree.go (1)
  • NewTxTree (71-135)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: unit tests
  • GitHub Check: integration tests
🔇 Additional comments (1)
internal/core/application/sweeper.go (1)

396-563: Lazy subtree resolution in createBatchSweepTask and updated comments look good

Re‑hydrating the round vtxo tree inside createBatchSweepTask and then locating the subtree via roundVtxoTree.Find(vtxoTreeRootTxid) cleanly implements the new root‑txid–based scheduling model, and the explicit error when the subtree isn’t found should make data issues evident in logs. The subsequent logic—deriving subtrees with computeSubTrees, scheduling follow‑up sweeps using their root txids, and the clarified comments around “add the input anyway” and assuming a batch output is spent when the first vtxo is swept/unrolled—preserves the previous behavior while making intent clearer. The extra pass that filters out already‑spent outputs before building the sweep transaction is also well documented by the new comments; no functional issues stand out in this block.

@altafan altafan merged commit c28ac48 into arkade-os:master Dec 3, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants