Futures returned in library are not send #165

LLFourn · 2020-11-16T04:26:44Z

Send is explicitly opted out of here:

Line 49 in 43cb033

#[async_trait(?Send)]

.

This is a bit of a problem. For example, after potentially receiving some funds you might like to do:

tokio::spawn(wallet.sync(...).then(|res| info!(the result of the sync was "{:?}", res))

To sync the wallet in the background. This is not possible since the future must be Send to send to tokio::spawn.
It will take some re-engineering of the internals to make this work.

@afilini do you have opinions on how to address this?

The text was updated successfully, but these errors were encountered:

afilini · 2020-11-16T10:26:18Z

Isn't this somewhat connected to the discussion we had about making the wallet thread safe? If the wallet was thread safe then you could spawn a background thread and call sync() in that thread, without necessarily having to spawn a task using tokio and going through the async interface.

Actually, I think having a thread-safe wallet is probably the first requirement to then make the futures Send, otherwise it doesn't make sense to have a background task run sync() if you can't do anything else on the wallet because there's a Mutex that locks everything.

I guess the biggest problem with thread-safe wallets in general is that the database is in an inconsistent state during sync. We would probably need to have transactions of some sorts, so that you only commit the full changes at the end of sync and always keep the database in a consistent state.

afilini · 2020-11-16T10:31:23Z

So unless I have misunderstood something, I guess there are a few steps before we can get to working on Send futures:

Make Database thread-safe
Add transactional operations to Database
Update all the Blockchain types to use the transactions with Database
Make Blockchain thread-safe, or wrap it in a Mutex. Here we need some kind of locking mechanism to avoid having multiple sync running at the same time
Make Wallet thread-safe, remove RefCells. At this point you can already achieve the "sync in background" by spawning a thread and calling sync inside
??? to make futures Send

notmandatory · 2020-11-16T18:03:58Z

I agree the first priorities should be to make the Database, Blockchain and Wallet thread safe. For our bdk-jni android example the app code is spawning it's own threads from the JVM to sync the wallet or get balance updates, etc. So if bdk blocks on those calls it's not a problem.

LLFourn · 2020-11-16T22:56:42Z

The difficulty with the approach is that now you have to have two impls -- one using tokio::sync::Mutex and one using RefCell to account for no_std/WASM. I don't see how to get around this but I haven't got much experience finding work arounds. I feel it will be further complicated by maybe_await stuff because you now have to maybe_await the lock on the components as well (or just borrow_mut if using RefCell).

I think a better solution is to just remove interior mutability from Wallet (i.e. remove RefCell) and make &self mutable in methods like sync.
This would circumvent the need for all the rather involved steps @afilini mentions above.

Then if you want to do a sync in a background thread:

let wallet: Arc<Mutex<Wallet<...>>> = Arc::clone(&wallet);
tokio::spawn(async move {
    let mut wallet = wallet.lock().await;
    wallet.sync(...).await;
})

I think this will work. Although rust has a way of foiling my plans in this area.

The disadvantage of this is that now anyone wanting to use the wallet while sync is running is going to have to wait.
There could be ways of easing this in the future -- for example you expose a sync_next_batch method which only does a little bit of work (and indicates whether the sync is done). This would let another thread do something between successive calls to sync_next_batch.

thomaseizinger · 2022-01-10T03:10:31Z

I just ran into this while using bdk::Wallet within an actor framework.

I have one actor had controls access to the wallet and thus, the wallet itself does not need to be thread-safe. The message handlers are all async though and will be run on the tokio runtime.

At the moment, the lack of Send futures makes it impossible to use the EsploraBlockchain because without async-interface it panics that it would block a worker thread:

thread 'main' panicked at 'Cannot start a runtime from within a runtime. This happens because a function (like block_on) attempted to block the current thread while the thread is being used to drive asynchronous tasks.', /home/thomas/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.15.0/src/runtime/enter.rs:39:9

and with async-interface, the code doesn't compile because the async-trait implementation of Blockchain on EsploraBlockchain returns Futures that are not Send.

thomaseizinger · 2022-01-10T03:13:12Z

I think a better solution is to just remove interior mutability from Wallet (i.e. remove RefCell) and make &self mutable in methods like sync.

I would also advocate for this. Accessing the wallet from multiple threads / tasks should be an application decision and can be solved in various ways (actors, mutex, etc).

LLFourn · 2022-01-10T23:17:40Z

I've changed my thinking on this topic. After working with various database APIs they are not at all consistent about why they need to be &mut or not. For example, a database API may have &mut self for every call because it has only one connection and the underlying connection is mutable. Others may only have &mut self for writing operations but not for reading (this is what BDK database assumes). Furthermore, others may only have &mut self when starting and ending a transaction/batch update. Finally, some have no mutability at all and handle everything internally or using thread local variables like sled.

Therefore I don't think we can manage when the database should and shouldn't bee mutable and we should force the Database itself to handle thread safety i.e. make the database trait Sync. I don't think there is much of a cost to doing this and it sure would simplify a lot . Perhaps we'd need to make an exception for wasm32 so as to avoid using Mutex in MemoryDatabase. But the better solution here is just to make a proper wasm databse that uses web apis!

Side note: #461 made database updates atomic (for esplora and electrum) -- the sync logic no longer updates the database while syncing but puts everything into one big batch update at the end (see script_sync.rs). This makes things easier to reason about and means even if doing an async sync the database will never be locked across an .await point.
As far as I can tell it is sound to do multiple syncs at the same time and if not it is a problem with the sync logic. After #461 the db state should be consistent after a sync with esplora and electrum (I think!). It should probably be avoided anyway because it's redundant but it is not the responsibility of the database to ensure this.

thomaseizinger · 2022-01-13T08:47:55Z

Is making the DB thread-safe going to make the entire wallet thread-safe? Because if not, then why bother with internal mutability when users need to wrap it in a Mutex anyway?

LLFourn · 2022-01-18T22:59:17Z

Is making the DB thread-safe going to make the entire wallet thread-safe? Because if not, then why bother with internal mutability when users need to wrap it in a Mutex anyway?

Yes I think so. Well this would be necessary condition for going in this direction.

thomaseizinger · 2022-01-19T02:45:48Z

An alternative design could be to decouple the wallet and the database from each other. A call to sync could:

Require a SyncStatus struct as a parameter that can be obtained from the database
Return a SyncResult struct that can be applied to a database

This makes doing a sync a bit more verbose as users would have to do something like:

let status = db.get_sync_status();
let result = wallet.sync(status);
db.apply_sync_result(result);

However, the benefits would be that the wallet does not care whether or not the DB is mutable / immutable or sync / async. Using such a design may remove the need for Database trait entirely.

It is an ergonomics hit although I don't think it would be particularly bad. In my experience with BDK, I always wrapped the bdk::Wallet in a Wallet struct myself to provide slightly different APIs and/or compose more logic on top of it. Having to compose a few more pieces from bdk in my own Wallet struct wouldn't be a big deal here.

Going down this route would result in removal of all type parameters from Wallet and users would have to compose the wallet with a database and a client by passing data back and forth between those components. WDYT?

LLFourn · 2022-01-20T03:08:38Z

@thomaseizinger thanks for thinking more deeply about this problem. Here's the issue with your suggestion:

However, the benefits would be that the wallet does not care whether or not the DB is mutable / immutable or sync / async. Using such a design may remove the need for Database trait entirely.

The wallet would need the database trait still:

The wallet needs to use it to create transactions.
The wallet actually mutates it when generating new addresses.
In general you need to be able to query the database for stuff -- it's handy if it has the same API regardless of the underlying concrete DB.

Your line of questioning about what exactly should be the relationship between components is really important. What is a wallet? Is it even a good abstraction to base this library around? I think a lot about this but I don't think it's directly relevant here. In any case it seems to me that we do need a Database trait and making it Sync makes sense and would fix the problem at hand. I think this is true even if we adopted a sync API more like what you were suggesting unless we can stop all database queries during the sync. This might be possible but requires some careful design work!

thomaseizinger · 2022-02-04T04:59:21Z

In any case it seems to me that we do need a Database trait and making it Sync makes sense and would fix the problem at hand. I think this is true even if we adopted a sync API more like what you were suggesting unless we can stop all database queries during the sync. This might be possible but requires some careful design work!

I agree with you. Is someone currently working on this? If not, can we have some design notes on what is expected so it can be picked up?
Not being able to use an async-native backend with bdk is mildly annoying. I'd really like to move away from the blocking interface. Our work-around for #521 is to scan for 1000 addresses and together with a blocking client, this means we are blocking one of tokio's executor threads for ~4 seconds every time a sync is triggered.

Happy to contribute a PR if we can get some mentoring notes on how to change things :)

LLFourn · 2022-02-18T02:51:54Z

Ok so I will paint my grand vision of how this should go:

We separate the Blockchain traits into separate synchronous and asynchronous versions e.g. under bdk::async. The async version will of course require the futures to be Send. This is just a necessary cleanup that I think we want to do.
Instead of calling wallet.sync(&blockchain, options) (which will be the API after Remove blockchain from wallet #535) we make the API something more like what you suggested above:

let sync_session = wallet.start_sync(options)?;
let result = blockchain.sync(&sync_session)?; // this is the one that might be async
wallet.apply_sync(&result)?;

So this would fix things because we wouldn't care about whether the database is Sync we just care the sync_session is. Which isn't to say that the various databases shouldn't be Sync but it just means that we don't care about that here.

What is a SyncSession? I don't know yet. It depends on how much each sync needs to know about the database. Here's some points on this:

Right now we always do a "full" sync on esplora and electrum. Which means we don't look at what the blockheight was when we did the last sync and only get new stuff and we always sync for every address. The blockchain is provided with a database so that it doesn't download more transactions that it needs to.
If we were to at least improve this by only caring about transactions that have appeared after the last sync then maybe we wouldn't need to database at all since we can avoid downloading it by just looking at its height.

So consider the following structures

// given to the blockchain
struct SyncSession {
    // txs that were previously in the mempool.
    // So we don't need to download them again and can tell if we need to delete them.
    prev_mempool: Vec<Txid>,
    // The blockid of the last sync
    last_sync_block: (BlockHash, u32)
    // the list of addresses to look at for each keychain
    script_pubkeys_interested: HashMap<KeychainKind, Box<dyn Iterator<Item=Script> + Send>>
}

// returned from the blockchain
struct SyncResult {
     transactions: SyncTxs,
     // this will be set to last_sync_block next time
     sync_block: (BlockHahsh, u32)
     // where we are up to for each keychain
     keychain_indexes: HashMap<KeychainKind, u32>
}

enum SyncTxs {
    Partial {
       // txs we haven't seen before that are in mempool or chain now
       new: Vec<TransactionDetails>,
       // txs that were in the mempool but are now gone
       mempool_confirmed: Vec<Txid>,
       // these txs have disappeared from the mempool
       mempool_gone: Vec<Txid>
    },
    Full {
        // all transactions ever associated with this wallet.
        // wallet should hose the current database and replace it with the data here.
        txs: Vec<TransactionDetails>
    }
}

This stops us needing a reference to the Database during sync and allows us to just exchange data back and forth with the blockchain. The slight downside is we do a full download in the case where we previously synced on a stale block. But this will happen very rarely in practice like once a month if you are syncing all the time. If this is actually a problem we could store the last two blockhashes which would decrease this.

On a side not I think #521 could be fixed at the same time by architecting the script pubkey iterators properly.

Thoughts?

thomaseizinger · 2022-02-22T07:02:09Z

Thoughts?

If we can make this work then I believe this would be a really clean way of resolving the async/sync duality of blockchain operations. Pre-condition for that is that we can gather all required information in one go, do the sync and come back with a result.

With #535, perhaps the API should actually be the other way round:

blockchain.sync(&wallet).await?;

Maybe we can do the same thing for the database:

database.sync(&wallet).await?;

A wallet could by default have an in-memory storage and "sync" with a persistent backend like sqlite or seld.

notmandatory · 2022-02-25T19:35:29Z

I was discussing with @rajarshimaitra and he brought up a good point to consider, which is if always sync the database to an in-memory struct in the wallet could this be a memory problem for very big wallets, or even not so big wallets on memory constrained devices?

I like the blockchain.sync api and finding a way to split the blockchain and database from the core wallet lib.

notmandatory · 2022-02-25T19:55:03Z

Using something like an async feature flag, could we enable async versions of wallet functions, so if enabled would be called something like:

let blockchain: AsyncBlockchain = ...; // impl an async version of a blockchain
let database: AsyncDatabase = ...; // impl an async version of a database
wallet.sync(&blockchain, &database).await?;  // async version of the functions
wallet.get_new_address(&database).await?;

And if async feature not enabled:

let blockchain: Blockchain = ...; // impl a blocking version of a blockchain
let database: Database = ...; // impl a blocking version of  a database
wallet.sync(&blockchain, &database); // blocking versions of the functions
wallet.get_new_address(&database);

Though in this approach the Wallet and core lib crate would need to know about async and the code would be less clear with async and blocking versions of everything...so maybe not really an improvement. But food for thought.

EDIT:

I should have refreshed my memory on how the async-interface feature currently works. The #[maybe_async] and #[maybe_await] macros @afilini created are used to generate "async" versions of Blockchain and related Wallet methods, but only for the wasm32 target. Could this approach be extended to the Database trait and all wallet functions and not only the wasm32 target?

thomaseizinger · 2022-02-25T23:35:53Z

Feature flags really should never change APIs because they are evaluated additively by cargo.

Assume the following situation:

crate A depends on bdk without async-interface
crate B depends on bdk with async-interface

You cannot use crate A and B in the same dependency graph now because A will fail to compile because its code is written against the non-async API.

Feature flags work best if they add code, like additional functions. (Be careful with enum variants, those are also nasty).

LLFourn · 2022-02-26T22:13:22Z

Just wanted to second what @thomaseizinger said. One of the main goals of this refactoring is to remove all this maybe_async stuff and await_or_block etc so the features are additive. It may be possible to get rid of AnyBlockchain stuff in favour of just using trait objects too so we remove most of the other conditional enum variants too.

notmandatory · 2022-02-26T23:06:20Z

Thanks for the explanation, making features additive makes sense and sounds like the right way to design things. It's even in the Cargo Book which apparently I need to read more closely! 😵‍💫

cryptoquick · 2022-03-08T00:34:19Z

Ideally this would be implemented in such a way that the same code that works on web can also be easily targeted to work on desktop. Imagine a use-case where the isomorphic UI library Iced can compile to both using the bulk of the same application logic and UI code, with platform-specific edge cases handled through conditional compilation macros.

MaxFangX · 2023-01-25T06:12:17Z

FWIW I hacked out a workaround using a number of refactors. Definitely too ugly to be upstreamed but it gets the job done if you really really need BDK's futures to be Send today (which we did).

Here's the branch: https://github.com/lexe-tech/bdk/tree/max/thread-safe

And the two commits that implement it:

which at this point can be cherry-picked onto the 0.26 release.

You can confirm that the futures are indeed Send with the following:

cargo clippy --all-targets --features=async-interface,use-esplora-async --no-default-features -- --allow=warnings

Important to read the commit messages to understand the caveats for using this patch correctly (i.e. bdk::Wallet must be wrapped in an async Mutex to prevent concurrent access).

An actor that manages the `bdk::Wallet` resource. It allows us to use the wallet whilst the inevitably expensive on-chain sync is happening in the background. We would like to use the async version of `EsploraBlockchain`, but bitcoindevkit/bdk#165 is still an issue. The only hacky bit stems from the fact that we still have to implement certain non-async foreign traits and to access any `bdk::Wallet` resource we now have to go via async methods, since the actor is async. To do so, at some point we have to call an async function "from a sync context". The natural way to do so (for me) would be to use `runtime.block_on`, which schedules an async task and waits for it to resolve, blocking the thread. *But*, these non-async foreign trait methods are actually called elswhere in _async_ contexts. This leads to `runtime.block_on` panicking because `tokio` is trying to prevent us from blocking a thread in an async context. This is analogous to us misusing async and doing an expensive CPU-bound computation in an async context, but here `tokio` is able to "aid" us since `block_on` is provided by `tokio`. The solution is to use the following pattern: ```rust tokio::task::block_in_place(|| { tokio::runtime::Handle::current().block_on(async { // async code }) }); ``` From the documentation of `block_in_place`, we are able to run "the provided blocking function on the current thread without blocking the executor". We therefore avoid the panic, as we no longer block the executor when calling `block_on`. This has one final side-effect, which is that all `ln-dlc-node` async tests now need to go back to using the `multi_thread` flavour. `block_in_place` works by moving all scheduled tasks to a different worker thread and without `multi_thread` there is only one worker thread, so it just panics. Co-authored-by: Mariusz Klochowicz <mariusz@klochowicz.com>

thomaseizinger mentioned this issue Jan 10, 2022

Investigate coins not found in development mode get10101/itchysats#965

Closed

thomaseizinger mentioned this issue Jan 31, 2022

Use async-native esplora backend get10101/itchysats#947

Closed

notmandatory mentioned this issue Feb 7, 2022

Make the MemoryDatabase Send+Sync #540

Open

LLFourn mentioned this issue Mar 8, 2022

maybe_async complicates isomorphic use-cases #559

Open

LLFourn mentioned this issue Mar 8, 2022

Make futures send #560

Closed

cryptoquick mentioned this issue Mar 11, 2022

Isomorphic Compilation diba-io/bitmask-core#9

Closed

2 tasks

MaxFangX mentioned this issue Jan 20, 2023

Add transaction sync crate lightningdevkit/rust-lightning#1870

Merged

luckysori mentioned this issue Jun 6, 2023

Refactor away Mutex around BDK get10101/10101#730

Closed

luckysori mentioned this issue Jun 8, 2023

Add BdkActor get10101/10101#750

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Futures returned in library are not send #165

Futures returned in library are not send #165

LLFourn commented Nov 16, 2020

afilini commented Nov 16, 2020

afilini commented Nov 16, 2020

notmandatory commented Nov 16, 2020

LLFourn commented Nov 16, 2020 •

edited

thomaseizinger commented Jan 10, 2022

thomaseizinger commented Jan 10, 2022 •

edited

LLFourn commented Jan 10, 2022 •

edited

thomaseizinger commented Jan 13, 2022

LLFourn commented Jan 18, 2022

thomaseizinger commented Jan 19, 2022

LLFourn commented Jan 20, 2022

thomaseizinger commented Feb 4, 2022 •

edited

LLFourn commented Feb 18, 2022

thomaseizinger commented Feb 22, 2022

notmandatory commented Feb 25, 2022

notmandatory commented Feb 25, 2022 •

edited

thomaseizinger commented Feb 25, 2022

LLFourn commented Feb 26, 2022

notmandatory commented Feb 26, 2022

cryptoquick commented Mar 8, 2022

MaxFangX commented Jan 25, 2023 •

edited

Futures returned in library are not send #165

Futures returned in library are not send #165

Comments

LLFourn commented Nov 16, 2020

afilini commented Nov 16, 2020

afilini commented Nov 16, 2020

notmandatory commented Nov 16, 2020

LLFourn commented Nov 16, 2020 • edited

thomaseizinger commented Jan 10, 2022

thomaseizinger commented Jan 10, 2022 • edited

LLFourn commented Jan 10, 2022 • edited

thomaseizinger commented Jan 13, 2022

LLFourn commented Jan 18, 2022

thomaseizinger commented Jan 19, 2022

LLFourn commented Jan 20, 2022

thomaseizinger commented Feb 4, 2022 • edited

LLFourn commented Feb 18, 2022

thomaseizinger commented Feb 22, 2022

notmandatory commented Feb 25, 2022

notmandatory commented Feb 25, 2022 • edited

thomaseizinger commented Feb 25, 2022

LLFourn commented Feb 26, 2022

notmandatory commented Feb 26, 2022

cryptoquick commented Mar 8, 2022

MaxFangX commented Jan 25, 2023 • edited

LLFourn commented Nov 16, 2020 •

edited

thomaseizinger commented Jan 10, 2022 •

edited

LLFourn commented Jan 10, 2022 •

edited

thomaseizinger commented Feb 4, 2022 •

edited

notmandatory commented Feb 25, 2022 •

edited

MaxFangX commented Jan 25, 2023 •

edited