Improve compatibility with tokio #1367

Open
devrandom opened this issue Mar 17, 2022 · 6 comments

@devrandom (Member) commented Mar 17, 2022

Background

The VLS project uses gRPC (via tokio/tonic) to connect to a remote signer.

Symptom

If the gRPC client is created in the default tokio runtime, a deadlock may result.

Discussion

lightning-net-tokio calls into synchronous PeerManager code. This code takes a lock on the PeerManager.peers Mutex, which is held across any calls to the signer.

The ldk-sample calls PeerManager.get_peer_node_ids in do_connect_peer, which also takes a lock on that Mutex.

The above sync code can stall the I/O reactor, so the signer gRPC call never completes and the peers Mutex is never released.

A possible solution is to always call task::block_in_place, or perhaps Handle::spawn_blocking, when calling any Mutex-using sync code from inside lightning-net-tokio and ldk-sample; a sketch follows.
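A minimal sketch of the block_in_place variant, with a generic Mutex standing in for PeerManager.peers (the function and types here are illustrative assumptions, not actual lightning-net-tokio code):

```rust
use std::sync::{Arc, Mutex};

// Run sync, lock-taking work via block_in_place: the current worker's
// queued tasks migrate to other threads, so the I/O reactor is not
// stalled while the closure blocks. Requires the multi-threaded runtime.
async fn call_sync_safely(state: Arc<Mutex<Vec<u8>>>) -> usize {
    tokio::task::block_in_place(move || {
        let guard = state.lock().unwrap();
        // ... any sync work that may block while holding the lock ...
        guard.len()
    })
}
```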

@devrandom (Member, Author)

On further thought, a more general description of the issue is "holding an std Mutex across an await that performs tokio IO".

I believe that BitcoindClient in the LDK sample suffers from this.

The deadlock scenario is:

  • thread 1: locks mutex, then starts IO
  • thread 2: has the IO reactor, tries to lock mutex

Thread 2 is blocked on the mutex, and therefore the IO reactor is stalled. Thread 1 is stalled waiting for IO to complete. No progress can be made - deadlock.
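For concreteness, a minimal sketch of the two sides (illustrative code, with a tokio sleep standing in for the IO):

```rust
use std::sync::{Arc, Mutex};
use std::time::Duration;

// Thread 1: locks the mutex, then starts tokio IO while holding the guard.
async fn thread_one(shared: Arc<Mutex<u64>>) {
    let guard = shared.lock().unwrap();               // locks mutex...
    tokio::time::sleep(Duration::from_secs(1)).await; // ...then starts IO
    drop(guard);                                      // released only after IO
}

// Thread 2: if this blocks the thread currently driving the IO reactor,
// the sleep above never completes and `guard` is never dropped.
fn thread_two(shared: Arc<Mutex<u64>>) {
    let _guard = shared.lock().unwrap();
}
```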

@TheBlueMatt (Collaborator)

Right, I believe the issue here is in thread 1: we are in a sync context, take a sync mutex, and then move to an async context. In general we should never go from a sync context to (blocking on) an async context.
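A sketch of that anti-pattern (illustrative code, not anything in LDK; the sleep stands in for the signer I/O):

```rust
use std::sync::Mutex;
use std::time::Duration;
use tokio::runtime::Handle;

// Sync code takes a std mutex, then blocks its thread on an async
// operation. If this thread is (or starves) the one driving that
// operation's I/O, the call never finishes and the lock is never released.
fn sync_blocks_on_async(lock: &Mutex<u64>, handle: &Handle) {
    let _guard = lock.lock().unwrap(); // sync context: lock held
    handle.block_on(async {
        // blocking on an async context while holding the lock
        tokio::time::sleep(Duration::from_millis(10)).await
    });
} // lock released only here
```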

@devrandom (Member, Author)

Another variation of this is likely the cause of L2-Technology/sensei#73:

  • thread 1: takes a lock on a sync mutex, then spawns an LDK user task (e.g. persistence) onto thread 2 and blocks waiting for the task
  • thread 2: dispatches an LDK task, which tries to lock same mutex

Thread 1 can't make progress because the task it is waiting for is starved on thread 2. Thread 2 can't make progress because of the sync lock wait.

My previous conclusion was that it's never safe to spawn user tasks on the runtime LDK uses. That's why we use the following code to spawn lnrod user tasks in a separate runtime:

https://gitlab.com/lightning-signer/lnrod/-/blob/3e155d2ccecf45212a1bda74e7581a10f0e7ffd5/src/signer/vls.rs#L227-232
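A hedged approximation of that pattern (illustrative code, not the actual lnrod source): user tasks get their own runtime on a dedicated thread, so a task LDK blocks on can never be starved by the runtime LDK itself is running on.

```rust
use std::thread;
use tokio::runtime::Runtime;

// Build a dedicated runtime for user tasks (persistence, signing, ...)
// and return a Handle for spawning onto it from LDK callbacks.
fn user_task_handle() -> tokio::runtime::Handle {
    let rt = Runtime::new().expect("user runtime");
    let handle = rt.handle().clone();
    // Park the runtime on its own thread; tasks spawned via `handle`
    // are driven here, independently of the runtime LDK uses.
    thread::spawn(move || rt.block_on(std::future::pending::<()>()));
    handle
}
```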

@TheBlueMatt (Collaborator)

I don't quite understand how that is a deadlock - futures aren't tied to a thread, so a third thread should have no problem working on the future from thread 1, at least if I understand tokio correctly.

@devrandom (Member, Author)

I think tokio can't move tasks between threads in general - it can steal work in some cases as an optimization, but not in every case.

@TheBlueMatt (Collaborator)

Ah! The tokio Discord agrees: there's nothing preventing work from being stolen, but tokio doesn't steal aggressively; if other threads are sleeping, they aren't going to wake up to steal work.

mariocynicys added a commit to mariocynicys/rust-teos that referenced this issue Aug 20, 2022
This fixes a deadlock issue that appears with too many concurrent requests. This solution was taken from devrandom's issue: lightningdevkit/rust-lightning#1367
Another solution that would probably work is for lightning-net-tokio to wrap our sync calls with `tokio::task::spawn_blocking(|| sync_call()).await.unwrap()`, but it's not clear how that would affect performance for users with no async code requirements.

Note that peer_manager_tests no longer need to be annotated with `flavor = "multi_thread"`; that's because the runtime we block_on (inside lightning-net-tokio) is the artificial runtime created in `api::lightning::server`, which is already "multi_thread".
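A sketch of that spawn_blocking alternative (illustrative code; the shared counter stands in for the real sync call):

```rust
use std::sync::{Arc, Mutex};

// Hop the sync, lock-taking call onto tokio's blocking thread pool so it
// cannot occupy a reactor worker thread while it holds the lock.
async fn wrapped_sync_call(shared: Arc<Mutex<u64>>) -> u64 {
    tokio::task::spawn_blocking(move || {
        let guard = shared.lock().unwrap(); // the lock-taking sync work
        *guard
    })
    .await
    .expect("blocking task panicked")
}
```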
mariocynicys added commits with the same message to mariocynicys/rust-teos that referenced this issue on Feb 5, 2023.