fix: replace abort() with cooperative wait in wait_for_run_task #576

xdustinface merged 3 commits into v0.42-dev

Conversation
No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration: Path: .coderabbit.yaml · Review profile: CHILL · Plan: Pro · Run ID:
📒 Files selected for processing (1)
📝 Walkthrough

Replaces unconditional abort of the client's run task with a cooperative shutdown: introduces a cooperative wait on the run task (`wait_for_run_task`, with a `RUN_TASK_SHUTDOWN_TIMEOUT` fallback to `abort()`).

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant API as Client API
    participant Client as FFIDashSpvClient
    participant Task as RunTask
    participant Timer as Timeout
    API->>Client: request stop / destroy
    Client->>Client: cancel shutdown_token
    Client->>Task: wait_for_run_task()
    par run finishes before timeout
        Task-->>Client: finishes
        Client-->>API: return (done)
    and run does NOT finish before timeout
        alt timeout elapsed
            Timer-->>Client: timeout
            Client->>Task: task.abort()
            Task-->>Client: aborted join (await)
            Client-->>API: return (aborted)
        end
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##          v0.42-dev     #576      +/- ##
=============================================
- Coverage     67.03%   67.00%    -0.04%
=============================================
  Files           320      320
  Lines         67249    67255        +6
=============================================
- Hits          45083    45062       -21
- Misses        22166    22193       +27
```
Force-pushed from aaddece to 23accc0 (compare)
This PR has merge conflicts with the base branch. Please rebase or merge the base branch into your branch to resolve them.
`abort()` can interrupt the cleanup sequence in `DashSpvClient::run()` (the `monitor_shutdown.cancel()` + `tokio::join!`), leaving monitor tasks running after FFI callback pointers are freed. Use cooperative wait with a timeout fallback instead.
Force-pushed from fabbdad to b334da6 (compare)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dash-spv-ffi/src/client.rs`:
- Around line 117-122: Add unit tests covering both cooperative shutdown and
timeout fallback for the new wait_for_run_task behavior: write one test that
starts DashSpvClient::run() (or a test helper that spawns run task), signals the
shutdown_token and asserts the run task exits without being aborted (cooperative
path), and a second test that simulates a non-responsive run loop (e.g., by
holding the run future) so wait_for_run_task hits its timeout and aborts the
task, asserting the abort path was used; target the wait_for_run_task method and
any test-helper hooks around DashSpvClient::run(), using
tokio::time::pause/advance or similar to deterministically drive the timeout.
- Around line 125-133: The timeout branch in the shutdown sequence (where
self.runtime.block_on(async { tokio::time::timeout(RUN_TASK_SHUTDOWN_TIMEOUT,
task).await })) only logs that it will "abort" but never actually cancels the
run task; update the branch that checks finished.is_err() to call task.abort()
(or the appropriate Cancel/abort method on the JoinHandle) after logging, and
optionally await or poll the handle to ensure it terminated; reference the
variables/functions runtime, block_on, tokio::time::timeout,
RUN_TASK_SHUTDOWN_TIMEOUT and the task/JoinHandle when making the change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a39adbd9-16da-4e92-9b56-cd8ac16888fb
📒 Files selected for processing (1)
dash-spv-ffi/src/client.rs
Addresses CodeRabbit review comment on PR #576 (comment)
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
dash-spv-ffi/src/client.rs (1)
351-368: ⚠️ Potential issue | 🟠 Major

**Shutdown ordering in `dash_spv_ffi_client_destroy` conflicts with `stop_client_internal` and may undermine the cooperative shutdown fix.**

The destroy function calls `stop()` before `wait_for_run_task()`, while `stop_client_internal` does the opposite: `wait_for_run_task()` then `stop()`. This ordering difference contradicts the PR's intent to ensure `run()` completes its cleanup before other operations interfere.

Since `stop()` initiates shutdown of the sync coordinator, network, and storage while the run task may still be in its cleanup phase (joining tasks), the current destroy ordering could still cause the race condition this PR aims to prevent.

Proposed fix

```diff
 pub unsafe extern "C" fn dash_spv_ffi_client_destroy(client: *mut FFIDashSpvClient) {
     if !client.is_null() {
         let client = Box::from_raw(client);
         // Cancel shutdown token to stop all tasks
         client.shutdown_token.cancel();
+        // Wait for the run task to finish (cooperative, with timeout fallback)
+        client.wait_for_run_task();
         // Stop the SPV client
         client.runtime.block_on(async {
             let _ = client.inner.stop().await;
         });
-        // Wait for the run task to finish (cooperative, with timeout fallback)
-        client.wait_for_run_task();
         tracing::info!("FFI client destroyed and all tasks cleaned up");
     }
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dash-spv-ffi/src/client.rs` around lines 351 - 368, dash_spv_ffi_client_destroy currently calls client.inner.stop() before client.wait_for_run_task(), which contradicts stop_client_internal's ordering and can reintroduce the race; change the shutdown sequence in dash_spv_ffi_client_destroy so after turning the Box from raw and cancelling the shutdown_token you first call client.wait_for_run_task() (using the existing method) and only then call client.runtime.block_on(async { let _ = client.inner.stop().await; }); keep the shutdown_token cancellation and final tracing::info call as-is so run() can finish cleanup before stop() tears down coordinator/network/storage.
🧹 Nitpick comments (1)
dash-spv-ffi/src/client.rs (1)
122-123: Consider handling poisoned mutex for defensive robustness.

The `unwrap()` on the mutex lock is acceptable here, since a poisoned mutex would indicate a panic during run_task manipulation, an unrecoverable state. However, for defensive coding, you could use `lock().ok().and_then(|mut g| g.take())` to silently treat a poisoned mutex as "no task to wait for" rather than panicking during teardown.

♻️ Optional defensive alternative

```diff
 fn wait_for_run_task(&self) {
-    let task = self.run_task.lock().unwrap().take();
+    let task = self.run_task.lock().ok().and_then(|mut guard| guard.take());
     if let Some(mut task) = task {
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dash-spv-ffi/src/client.rs` around lines 122 - 123, The current wait_for_run_task method uses self.run_task.lock().unwrap().take() which will panic on a poisoned mutex; change it to handle poisoning defensively by replacing the unwrap() call with lock().ok().and_then(|mut guard| guard.take()) (or an equivalent match that treats a poisoned lock as None) so wait_for_run_task returns/acts as if there is no task instead of panicking; update any subsequent code that assumes a task was obtained to handle the None case gracefully.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@dash-spv-ffi/src/client.rs`:
- Around line 351-368: dash_spv_ffi_client_destroy currently calls
client.inner.stop() before client.wait_for_run_task(), which contradicts
stop_client_internal's ordering and can reintroduce the race; change the
shutdown sequence in dash_spv_ffi_client_destroy so after turning the Box from
raw and cancelling the shutdown_token you first call client.wait_for_run_task()
(using the existing method) and only then call client.runtime.block_on(async {
let _ = client.inner.stop().await; }); keep the shutdown_token cancellation and
final tracing::info call as-is so run() can finish cleanup before stop() tears
down coordinator/network/storage.
---
Nitpick comments:
In `@dash-spv-ffi/src/client.rs`:
- Around line 122-123: The current wait_for_run_task method uses
self.run_task.lock().unwrap().take() which will panic on a poisoned mutex;
change it to handle poisoning defensively by replacing the unwrap() call with
lock().ok().and_then(|mut guard| guard.take()) (or an equivalent match that
treats a poisoned lock as None) so wait_for_run_task returns/acts as if there is
no task instead of panicking; update any subsequent code that assumes a task was
obtained to handle the None case gracefully.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9588d8e5-ae37-4f9b-9b11-bf1746a99088
📒 Files selected for processing (1)
dash-spv-ffi/src/client.rs
Wait for the run task before calling `stop()` so `run()` can finish its own cleanup. Matches the ordering in `stop_client_internal`.
`abort()` can interrupt the cleanup sequence in `DashSpvClient::run()` (the `monitor_shutdown.cancel()` + `tokio::join!`), leaving monitor tasks running after FFI callback pointers are freed. Use cooperative wait with a timeout fallback instead.

Based on: `DashSpvClient` #572

Summary by CodeRabbit