Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to wallet syncing and monitoring #359

Merged
merged 1 commit into from
Oct 25, 2019
Merged

Conversation

AdamISZ
Copy link
Member

@AdamISZ AdamISZ commented Jun 13, 2019

Idea of this PR was prompted by some related problems.

See #328 - there seems to be some flaw in full sync at least (and fast also perhaps) involving not recognising/registering new addresses to the wallet such that restarts almost always involve retrying the sync when that shouldn't be necessary.
Not obviously related, but #336 for example is a simple failure at UI level where wallet does not dynamically update the GUI; similar but more fundamental, a deposit into the wallet during operation is not recognized until restart.
I think all these should be fixed, and it makes sense to do it as one "project". I'd like to achieve these things:

  • Make wallet sync aggressive and optimistic and correct - so iron out any bugs, go with the approach used in --fast sync of assuming "normal" conditions and make it always work, first time, then - for the restarting-on-new-Core-wallet case, we can still keep the more clunky fallback.
  • Create a better/sharper interface between the client apps and the blockchain via the wallet. So that only the wallet will be affected by future changes (I'm thinking particularly of the possible future BIP157/8.
    usage)
  • Better asynchronous update. Let the clients (including the GUI as another type of client) subscribe to events, in particular changes to the wallet's utxo set. This can be made generic so the way it's done to solve the above mentioned "dynamic deposit" case works the same as for taker and makers watching for completions of coinjoins.

The first pushed commit here is something fairly minor, but part of the clean up: removed duplicated import_new_addresses functions and replaced them with an atomically in-wallet call, so that we can "guarantee" that all addresses used during JM operation have been imported.

@AdamISZ
Copy link
Member Author

AdamISZ commented Jun 15, 2019

As of 8240655 I can report that I no longer have the problem of "need to do multiple restarts" from my ongoing mainnet testing.

The change is explained in some detail in the commit message.
However I did neglect to mention one other minor point there, so I'll summarise here:

(1) save the wallet indices at the end of a successful sync, as otherwise a change in the index will not be persisted unless a transaction happens (meaning on restart, you have to go back to the previous, unsynced index).
(2) This is less obvious, but I decided to include a step of importing from (current index) to (current index + gaplimit) each time we request a new address. This is very cheap and it doesn't harm anything to import an address multiple times, and it avoids the problem that was causing unnecessarily-forced-restarts (see #328 - this specific commit should fix that), which was caused by the previous "import the new address" being insufficient for the sync, which requires everything imported up to the gap limit. Hope that's clear.

@kristapsk if you have a chance to test, let me know if this commit fixes sync for you on whatever big wallet you have around.

@undeath This isn't much more than a curiosity at this point, but don't you think these lines are un-needed?

@kristapsk
Copy link
Member

kristapsk commented Jun 15, 2019

https://github.com/JoinMarket-Org/joinmarket-clientserver/pull/359/files#diff-79627d21bc41aff521fe45af1ee3968aR560 breaks wallet-tool's history method and probably other read-only stuff, raises jmclient.storage.StorageError: Read-only storage cannot be saved.

@AdamISZ
Copy link
Member Author

AdamISZ commented Jun 15, 2019

@kristapsk yep i saw it earlier today also. Fix is in the next commit.

Basically, some wallet-tool methods are "noscan" and that's important so you can access the data without trying to sync the wallet (like showseed). We also had some additional methods marked 'read-only'; in my judgement it's better to not have that, so that the wallet's index (or sometimes called index_cache) can be updated after a sync, because otherwise if you run wallet-tool multiple times it'll have to redo all the same scanning work every time to update the index to the correct value.

So I removed the 'read-only' flag from runs of wallet-tool on the methods 'display' and similar (like 'showutxos') that must sync/scan the wallet/blockchain.

I judge there's no meaningful loss in security; this is not a long-running process like the maker or taker. You're still reading the secrets into memory, so there's no point blocking an update to the wallet file itself.

@kristapsk
Copy link
Member

There is big downside with removing read-only flag - now you can't run them while yield generator is running, because of wallet locks. So, it requires to shutdown yg, just to view history or list addresses (let's say, you want to send additional funds to your yg wallet) or do anything else with a specific wallet.

@AdamISZ
Copy link
Member Author

AdamISZ commented Jun 15, 2019

Ah gotcha. Will figure it out tomorrow then.

@AdamISZ
Copy link
Member Author

AdamISZ commented Jun 16, 2019

OK I'll just remove that index-saving code, it won't help anyway in normal conditions, it would only rarely make things more efficient, and I agree it's better not to screw up the locking function vs allowing read-only query.

@kristapsk
Copy link
Member

With 1e4e63c haven't found any new issues and it solves #328 for me.

@undeath
Copy link
Contributor

undeath commented Jun 18, 2019

@undeath This isn't much more than a curiosity at this point, but don't you think these lines are un-needed?

Those are an emergency fallback in case the wallet has been uncleanly closed (eg because of a crash). Removing those could cause jm to not recognise already used addresses, re-using them instead (and failing to display funds in there). Removing those would mean the user has no way to instruct jm to scan addresses beyond the wallet-saved index.

@AdamISZ AdamISZ mentioned this pull request Jul 15, 2019
@kristapsk
Copy link
Member

I am running this all the time and didn't got any "restart Bitcoin Core..." messages until today. Only thing different is that I deposited some funds into external address of a wallet. But I think I have done it before (like in a last month or so) too, but don't remember having that message. In any case, clearly an improvement, although if depositing funds can cause it, will not be ok for #336.

@AdamISZ
Copy link
Member Author

AdamISZ commented Aug 5, 2019

I am running this all the time and didn't got any "restart Bitcoin Core..." messages until today. Only thing different is that I deposited some funds into external address of a wallet. But I think I have done it before (like in a last month or so) too, but don't remember having that message. In any case, clearly an improvement, although if depositing funds can cause it, will not be ok for #336.

Well that's a disappointment. Clearly the intention was to prevent this; I'll try to reproduce, but pretty sure this was exactly the scenario I tested before (and indeed, both in testing and real world use I am not seeing this), so I don't know what causes it. If you can provide any more info on the circumstances it may help.

I'll get back on working on this whole attempted syncing revamp today, or this week, at least I'll try to make progress ...

Edit: hmm I see you mention #336 independently of #328 so I guess you mean, a deposit while JM is running, which is distinct I guess, from a deposit between runs. Maybe I didn't test that. Well, I am going to progress with earlier plans to address #336 anyway, so we'll have to wait to see if those plans actually solve this problem.

@kristapsk
Copy link
Member

deposit while JM is running, which is distinct I guess, from a deposit between runs

I was depositing while yg is running and then got error with wallet-tool, IIRC.

Will be mosly offline for about a week now, will be able to do more testing from my side next week.

@AdamISZ
Copy link
Member Author

AdamISZ commented Aug 6, 2019

So I've spent a day or more reviewing and analysing the syncing code as-is, and I've come to the following conclusions:

  • The example you give, @kristapsk , is valid (and I can also confirm in regtest); an external deposit will always cause this problem (not sure how I thought this wasn't the case before but it is).
  • The specific reasons for it are related to the assumption-free mode of operation of the default (non-fast) syncing.
  • This problem does not occur with the fast sync option. More on this below.

So, the result of the review and discussion is that I believe I made an error in not promoting the fast sync option, back at the time (years ago), to the default method of syncing. As the name suggests, it is indeed fast, as in a second or two even for very large/old wallets, but it is also more reliable - if certain, reasonable assumptions hold.

The approach behind fast sync:

We work on the assumption that all addresses that were ever used in the wallet label joinmarket-XX were those created by that Joinmarket wallet. Based on this, we go through these steps:

  • Collect all addresses ever used on that label with rpc listaddressgroupings
  • Additionally, record the current wallet index values per branch as per the wallet file, if they exist
  • Working in batches, search forward through the HD wallet branches, collecting instances of usage of addresses found in step 1.
  • Stop searching forward when all addresses identified in step 1 were matched to addresses in the JM HD wallet
  • Once finished, mark the wallet indices as the highest found used address or the current wallet indexes (from step 2), whichever is higher.

Notice how this leverages two basic assumptions - no "foreign" usage of the JM HD wallet's specific label, and the wallet branches are not used beyond batch size * num iterations - to make a very simple algorithm. With perhaps a bump to the latter (batch size maybe 200 should be fine), this should cover every situation where the wallet is created on one Core instance and then progressively used.

About imports:
A primary source of complexity is imports required on the 'lower level' Core wallet. In case of progressive usage on one Core instance, we only need to ensure that imports are effected up to the gap limit beyond whatever is the current index, which is now done with 1e4e63c . So any deposit by a user to an address which is actually displayed would be an address that's already imported. This addresses @kristapsk 's issue (note: it doesn't address it with the current default sync mode, see below).

About the detailed/non-fast sync:

The basic principle here is not to assume anything; it is specifically designed to address the needs of a recovery operation of a wallet on a new Core instance, where there is no wallet jmdat file and thus no JM HD wallet index, and there are no imports. It also does not assume that there are no usages of addresses beyond the gap limit, nor does it assume (or, let's say, use the assumption) that there are no usages of addresses at that label that are "foreign" to the JM HD wallet.

This set of assumptions, or rather non-assumptions both makes it slow and also makes it hard to reason about and unreliable. One example of that is what @kristapsk found above: when he deposited to a new address, it created a situation where the set of "gap addresses" (the gap_limit addresses on each branch beyond the currently set index) was bumped forwards and then included new, un-imported addresses, causing the triggering of this line, which shuts down after doing address imports, specifically because it does not 100% know that the newly imported addresses are unused (in fact, that assumption is nearly safe, so one could argue for changing this), and if they are used, then a rescan is needed.

The more general problematic symptom, which existed before 1e4e63c , was that restarts were needed after every transaction in the wallet. This was a very similar but not identical situation to the above: new transactions caused new imports and bumped the index, but the imports triggered did not include the whole of the gap addresses beyond that current index, and so this triggered a quit even earlier, here.

In case it wasn't clear, the only difference between these two situations was that the "deposit" scenario didn't bump the index, whereas a transaction did.

This is just a subset of the complexity here. It's important to note that not only is the second algorithm more complex, but that because it makes fewer assumptions it actually can often fail where the first algorithm succeeds, as well as being slower.

Proposed changes:

Make --fast the default. Note that it handles initial startup perfectly fine, see here. If a person continues to use Joinmarket normally on one Core instance, they should never need to do anything else, until they hit the limits hardcoded as batch_size and number of iterations (with the bump to 200 mentioned above, this should be at about 4000 addresses per branch). In any case, we can of course make sure that the error message clearly indicates:

  • If the batch limits are exceeded, clearly state it and suggest making a new wallet or giving an option to bump
  • Important message displayed prominently, always: if the balance shown is 0 or less than expected, you should try to sync in "recovery" mode

Then also: rename the current default sync mode to "recovery" mode, as that will be its primary use case.

All of the above is about sync, not about wallet monitoring which should be addressed separately.

Ping @undeath @adlai @chris-belcher @AlexCato just in case you have thoughts on this.

@AlexCato
Copy link
Contributor

AlexCato commented Sep 4, 2019

Proposed changes:

Make --fast the default. Note that it handles initial startup perfectly fine, see here. If a person continues to use Joinmarket normally on one Core instance, they should never need to do anything else, until they hit the limits hardcoded as batch_size and number of iterations (with the bump to 200 mentioned above, this should be at about 4000 addresses per branch). In any case, we can of course make sure that the error message clearly indicates:

* If the batch limits are exceeded, clearly state it and suggest making a new wallet or giving an option to bump

* Important message displayed prominently, always: if the balance shown is 0 or less than expected, you should try to sync in "recovery" mode

Then also: rename the current default sync mode to "recovery" mode, as that will be its primary use case.

I approve of all these changes and would be happy to see them. I guess most users are unaware of --fast and barely read any documentation. I've lately forgot --fast sometimes, and boy that does take forever on a wallet that has seen some use. So it's a very userfriendly change to make --fast the default; also renaming the other one to recovery is good.

@AdamISZ
Copy link
Member Author

AdamISZ commented Sep 25, 2019

This was force-pushed today. There is already here a major refactoring, and I am increasingly confident that it's a very useful change. Since it's very big, here's a summary:

  • About 400 LOC removed from blockchaininterface. They are: (1) the various collect_addresses* and rewind_indices* functions which are moved into the wallet, as they are pure wallet functions, and (2) removal of 3 very large functions which were used to monitor transactions (add_tx_notify, tx_watcher and outputs_watcher). For replacement of those, see:
  • New module in jmclient called wallet_service.py which houses an object WalletService. This has a couple of functions: to encapsulate access to the wallet (theoretically it could allow use of multiple wallets simultaneously, even) and better define an interface, and also encapsulates some access to the blockchaininterface, although currently not all. More importantly though, perhaps, is having a central monitoring loop for transactions that can be effectively subscribed to (see register_callbacks comments) to get updated on new transactions and confirmations, as before, but in a slightly less complex way. This architecture has lessened the rather painfully twisty cross referencing from client to bci to wallet, back to bci etc etc.
  • Further: Now clients (Takers, Makers) do not need to manually call add/remove utxos, this happens automatically within the wallet service, i.e. the utxo set is updated as part of the transaction monitoring loop which is always-on. This last point is crucial because it makes it dead simple to keep the Qt in-sync with any external transactions (deposits etc.) as is kind of required for a decent GUI wallet. (also long running ygs should be able to use deposits, same principle).
  • Some changes to the wallet in line with the above - a new method process_new_tx encapsulates the above, so that the utxo set is always updated with transactions that are relevant to the wallet (and any that are not are just ignored).
  • Access to the wallet methods is now almost exclusively via the WalletService interface, even where this means trivial pass-through functions (I'm currently vacillating on whether that should be 100% strict - probably, yes - but we can always change it later since it's just refactoring).

Avoiding getting lost in detail by listing more, but next steps:

  • Main CLI scripts working, but need to update the lesser CLI tools.
  • Update joinmarket-qt.py, including dynaming refresh of wallet balance
  • Update tests

None of these should be particularly hard at all, but they'll take some days. I'll also need to rebase of course. I expect this to be done within a week, probably. Feedback would be appreciated, especially if you can test after I've fixed the tests (dubious value to test it now).

@AdamISZ
Copy link
Member Author

AdamISZ commented Sep 25, 2019

Syncing:

The above commit adds the "default to fast sync" as previously discussed. Otherwise, the syncing methods are largely unchanged (except that you don't have to call them directly in user-level scripts; the WalletService does that for you).

I should also add that these changes have not addressed recently discovered problems with regard to imports, but they provide a route to doing so - we can add periodically woken up calls to do imports in small batches (not exactly a beautiful thing, but we seem to have no choice given what is effectively a bug in Core for slow HDs).

@AdamISZ
Copy link
Member Author

AdamISZ commented Sep 30, 2019

I have removed this from WIP status because I think a natural end point has been reached that achieves these basic goals (not entirely what I set out, but a very significant chunk of improvement):

  • Both yield generators and GUI will automatically recognize deposits (and in the latter case, show them); note how this happens now without code explicitly, because the wallet is constantly updated via the WalletService.
  • Fast sync is now default, which will take away all the problems for normal operations afaict
  • The wallet service architecture just uses one transaction monitoring loop, and much of the complexity in the blockchain interface code has been removed.
  • The wallet is now almost entirely encapsulated within the wallet service, which could lead to smarter ways of handling wallets in future (including making entirely new wallet types with no impact at higher levels.

There are other minor things, notably the Qt Wallet tab properly respecting expansion states, and a few small refactorings.

What has not been done is to cache importing, but I believe this will be much easier with the WalletService architecture.

  • Testing

What has been tested:
Apart from the basic test suite, I have tested all of these on regtest:
wallet-tool display, showutxos, sync, showseed, generate
tumbler basic runs
sendpayment direct send and join
tested timeout functions
Qt: wallet creation and syncing, deposits
yieldgenerator funcion

TODO: keep testing as many functions as possible. I will test payjoin (although it runs in functional tests), and I will keep running from Maker and Taker on mainnet. So far I have run yieldgenerators and seen transactions go through, but it needs quite a bit more.

So I would very much appreciate people running this in various ways, especially on mainnet. Since it's a broad change it needs a fair bit, there will doubtless be one or two more bugs, but I expect them to be minor at this point.

(Will squash at end but left commits for now in case it's easier to read)

@AdamISZ AdamISZ changed the title [WIP] Improvements to wallet syncing and monitoring Improvements to wallet syncing and monitoring Sep 30, 2019
@undeath
Copy link
Contributor

undeath commented Oct 2, 2019

This looks like a significant improvement over the kind of syncing done right now, thanks for this! I plan on reviewing and testing this code in the near future.

When I start jm I see a bunch of "saw tx on network" messages with this code, which is a little confusing.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 2, 2019

Thanks.

When I start jm I see a bunch of "saw tx on network" messages with this code, which is a little confusing.

Yes you're right. I haven't got round to it, it's just an artifact of setting up the monitoring loop by "caching" the current set of returns from listtransactions. Will do.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 22, 2019

@chris-belcher @AlexCato thanks. @kristapsk - It's been a while, have you checked again and specifically were you running dad8bcc ?

@chris-belcher
Copy link
Contributor

Yes I was running that commit.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 22, 2019

@chris-belcher I think you misread :) I wasn't asking you that, only @kristapsk specifically - and that, because what he described on IRC at the time led to me to suspect that a code divergence may have caused the crash he experienced. For your case, I don't find it surprising that there may be such an error considering I haven't re-tested manual recovery since the latest changes, so I'll just repro and fix it today, most likely.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 22, 2019

So 3a60c67 should fix that.

It came from me moving code around while refactoring. Now it behaves as before: if there is no restart callback specified, it defaults to command line printout. The bug wasn't specific to recovery (as is I guess obvious) (uh ... well; recovery or new wallet, I guess; anyway, it was a bug).

A reminder that --recoversync should be used when recovering on a new instance (as above, way above, in the scrollback, we now use fast as default and what was default, is called 'recoversync').

@chris-belcher
Copy link
Contributor

chris-belcher commented Oct 22, 2019

With that new commit it correctly recovered a mainnet wallet from seed phrase.

Edit: also just tested by using sendpayment.py with my own yield-gens on regtest. It seems to all work correctly, including only announcing confirmed funds.

@kristapsk
Copy link
Member

@AdamISZ , no, was busy with other stuff and now I'm travelling this week, so cannot promise anything in coming days.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 22, 2019

@kristapsk understood; thanks for letting me know.

@AdamISZ
Copy link
Member Author

AdamISZ commented Oct 23, 2019

@kristapsk fwiw tested summary and seems fine. And I have one more comment after our long discussion yesterday: on the one hand you said that showutxos returned {}, but a little further on you said, that you saw one address with coins in mixdepth 4. That's particularly weird, surely something conflicts there. (Edit: delete that, I read more carefully: you said the coin was there, as reported in the old version of the code, you don't see it in the new version: that's far less strange! Sorry). I still don't have good ideas, maybe try to recover the wallet with a new rpc_wallet_file and see if that fixes it. So far I haven't found a reproduction of your problem (or a way to do it), including @AlexCato saying he is able to sync large/old wallets.

Introduces WalletService object which is in control of
blockchain and wallet access.
The service manages a single transaction monitoring loop,
instead of multiple, and allows updates to the wallet from
external sources to be handled in real time, so that both Qt
and other apps (yg) can respond to deposits or withdrawals
automatically.
The refactoring also controls access to both wallet and
blockchain so that client apps (Taker, Maker) will not need
to be changed for future new versions e.g. client-side filtering.
Also updates and improves Wallet Tab behaviour in Qt (memory
of expansion state).
Additionally, blockchain sync is now --fast by default, with
the former default of detailed sync being renamed --recoversync.
AdamISZ added a commit that referenced this pull request Oct 25, 2019
c654de0 Wallet and blockchain refactoring (AdamISZ)
@AdamISZ AdamISZ merged commit c654de0 into master Oct 25, 2019
@AdamISZ AdamISZ mentioned this pull request Oct 26, 2019
@AdamISZ AdamISZ deleted the walletservice branch November 4, 2019 13:24
@AdamISZ AdamISZ mentioned this pull request Dec 10, 2019
AdamISZ added a commit that referenced this pull request Jan 3, 2020
In refactor for #359 it was noted that the wallet
monitoring loop in Qt updated the status bar every
5 seconds, overwriting any existing status updates.
This fixes that UI bug so that the wallet synced
successfully (or unsucessfully) message is only shown
at start up or if there is a change of status (i.e.
the wallet monitoring loop stops working).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants