Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Wallet] Worst case performance improvement on KeyPool filtering #10184

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
4 participants
@NicolasDorier
Copy link
Member

NicolasDorier commented Apr 11, 2017

#9294 introduced internal HD chain.
This made ReserveKeyFromPool iterates on the whole keypool (One disk access per key) to find a key matching internal.

This PR changes the way TopUpKeyPool creates keys in the pool.
Instead of creating all the external keys followed by all the internal keys (which would provoke lots of disk hit in ReserveKeyFromPool if internal is true), it alternates them.

@NicolasDorier NicolasDorier force-pushed the NicolasDorier:hdinternalperf branch Apr 11, 2017

@NicolasDorier NicolasDorier force-pushed the NicolasDorier:hdinternalperf branch to 153966c Apr 11, 2017

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 11, 2017

Isnt it a much simpler approach to just switch the keypool set to two separate sets instead? On wallet load we read each keypool entry from disk anyway, so just caching it differently should be trivial.

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 12, 2017

I think this would be the right long term approach. I don't think it is simple to change though. This PR is just some tape on the current approach.

Anyway, I take a look, maybe it is not as complicated as I think.

@fanquake fanquake added the Wallet label Apr 12, 2017

@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 12, 2017

Long term, keypools do make little sense to me, in conjunction with HD. For encrypted wallets, we may want to disable any private key export function and use public key derivation (and remove both keypools). For the HD lookup window, we can generate (in memory) ~+50 keys during startup and extend it whenever the user retrieves a new address. IMO there is no need to persist the set.

I'm not sure if it's worth to optimise this further (this PR seems to be okay though).

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 12, 2017

@jonasschnelli I think we very much need to fix the performance regressions before 0.15. The introduction of many disk hits on some paths is potentially a huge issue for some users.

I'm slightly confused about this PR, however. Am I reading it correctly to think that its just a different way of identifying keys and the actual performance improvements will come in a later PR?

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 13, 2017

@TheBlueMatt I think this PR is enough to solve current inconvenience.

Before:

TopUp of 100 key will fill the keypool in the following way:

[external1][external2]...[external100][internal1][internal2]...[internal100]

Now if you call ReserveKeyFromPool with internal=true this create 100 disk hit.

This PR makes it so TopUp of 100 key fill the keypool in the following way

[external1][internal1][external2][internal2]...[external100][internal100]

Which mean that ReserveKeyFromPool will have 2 disk hits instead of 100.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 13, 2017

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 13, 2017

@TheBlueMatt I don't think the perf regression is that bad. The only time there is a count is during TopUpKeyPool, which is slow anyway. With this PR the perf regression become +1 disk hit per call to ReserveKeyFromPool, which is not that bad.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 13, 2017

@NicolasDorier There are also a performance regression in GetOldestKeyPoolTime(), making getinfo and getwalletinfo have to iterate the entire keypool reading each element from disk. Note that TopUpKeyPool is called in a bunch of places, and isnt supposed to always be slow, as it is often called to top up the keypool by generating one or zero keys.

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 14, 2017

@TheBlueMatt I think the problem of TopUpKeyPool is orthogonal to this PR. If we solve the TopUpKeyPool problem by caching the count, then we would still have the problem of 100 disk hit when creating internal key.

Except if we cache the keys of the pool itself, and not only the count.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 14, 2017

@NicolasDorier yes, that was my original suggestion - split setKeyPool into two (since we already cache the sets of keys, might as well split them to keep extra information), and then you can fix TopUpKeyPool and all the other issues in one go :).

@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 19, 2017

To add some benchmarks to those speculation:
System: host = OSX 2.9 GHz Intel Core i7, running a Ubuntu 16.10 VM with all 8 cores.

I'm not convinced that we have a performance issue there,.. though I'm not opposed against doing performance improvements like this PR.

0.14.1rc2 (non chain switch)

getwalletinfo with a single key

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n1 getwalletinfo
{
  "walletversion": 130000,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492595934,
  "keypoolsize": 1,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "a22d397ef2fb35d02ddbd096cd7f7f4c9dad189b"
}

real	0m0.004s
user	0m0.000s
sys	0m0.000s

top up keypool 1000 keys

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n1 keypoolrefill 1000

real	0m1.407s
user	0m0.000s
sys	0m0.000s

getwalletinfo (there are now 1000 keys)

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n1 getwalletinfo
{
  "walletversion": 130000,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492595934,
  "keypoolsize": 1001,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "a22d397ef2fb35d02ddbd096cd7f7f4c9dad189b"
}

real	0m0.012s
user	0m0.000s
sys	0m0.012s

topup a single key

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n1 keypoolrefill 1001

real	0m0.007s
user	0m0.000s
sys	0m0.000s

With current master (build yesterday) https://bitcoin.jonasschnelli.ch/build/102

getwalletinfo with a single key

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n2 getwalletinfo
{
  "walletversion": 139900,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492596058,
  "keypoolsize": 0,
  "keypoolsize_hd_internal": 1,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "658072512a5a0a71cdbe16a7dca1a55ac8ab82fe"
}

real	0m0.003s
user	0m0.000s
sys	0m0.000s

topupkeypool with 1000x2 (now we have also 1000 internal keys)

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n2 keypoolrefill 1000

real	0m2.752s
user	0m0.000s
sys	0m0.000s

getwalletinfo now with 1000x2 keys

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n2 getwalletinfo
{
  "walletversion": 139900,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492596074,
  "keypoolsize": 1000,
  "keypoolsize_hd_internal": 1000,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "658072512a5a0a71cdbe16a7dca1a55ac8ab82fe"
}


real	0m0.008s
user	0m0.000s
sys	0m0.000s

topup a single key (actually two keys one internal one external)

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n2 keypoolrefill 1001

real	0m0.012s
user	0m0.000s
sys	0m0.000s

getwalletinfo

user@ubuntu:~/Desktop$ time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/home/user/Desktop/n2 getwalletinfo
{
  "walletversion": 139900,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492596074,
  "keypoolsize": 1001,
  "keypoolsize_hd_internal": 1001,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "658072512a5a0a71cdbe16a7dca1a55ac8ab82fe"
}

real	0m0.008s
user	0m0.000s
sys	0m0.000s
@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 19, 2017

@jonasschnelli what disk drive are you using, SSD? I fear this has way more impact on HD or network attached disk which suffer high latency.

@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 19, 2017

@NicolasDorier: Yes. It was SSD. NAS for key storage? Is that a reasonable use-case? I will redo the tests on an external USB 2 spinning disk a.s.a.p.

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 19, 2017

@jonasschnelli my bitcoin instances are hosted on Azure for example, attached data storages drives have kind of bad latency. Around 7ms if I remember.

Let me know, if it is not so bad with USB, I will also try on an azure instance.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 19, 2017

@jonasschnelli sadly many, many servers use SANs for storage to avoid relying on a single server's storage infrastructure.

@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 19, 2017

I now connected a relatively old 2.5" HDD (1TB, spinning) to my Host (OSX) and made it available via folder sharing.

I can't see even a mild performance issue.

The results are:

0.14.1rc2

getwalletinfo with a single key keypool:

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n1 getwalletinfo
{
  "walletversion": 130000,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492614680,
  "keypoolsize": 1,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "07daa6d010bdc3c756d58acafce93ac322d66df1"
}

real	0m0.006s
user	0m0.000s
sys	0m0.004s

topup keypool to 1000 keys

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n1 keypoolrefill 1000

real	0m4.609s
user	0m0.000s
sys	0m0.000s

getwalletinfo on a 1000 keys keypool

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n1 getwalletinfo
{
  "walletversion": 130000,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492614680,
  "keypoolsize": 1001,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "07daa6d010bdc3c756d58acafce93ac322d66df1"
}

real	0m0.009s
user	0m0.000s
sys	0m0.000s

topup and add a single key (= 1001 keys)

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n1 keypoolrefill 1001

real	0m0.010s
user	0m0.004s
sys	0m0.000s

with nightly build from yesterday

### getwalletinfo single key keypool

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n2 getwalletinfo
{
  "walletversion": 139900,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492614816,
  "keypoolsize": 0,
  "keypoolsize_hd_internal": 1,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "aaca57ac9a28ad0097a6b4d9f256e9a8ca02b16a"
}

real	0m0.004s
user	0m0.000s
sys	0m0.000s

### topup keypool (1000 keys each == 2000 keys)

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n2 keypoolrefill 1000

real	0m4.367s
user	0m0.000s
sys	0m0.000s

getwalletinfo on a 1000+1000 keys keypool

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n2 getwalletinfo
{
  "walletversion": 139900,
  "balance": 0.00000000,
  "unconfirmed_balance": 0.00000000,
  "immature_balance": 0.00000000,
  "txcount": 0,
  "keypoololdest": 1492614832,
  "keypoolsize": 1000,
  "keypoolsize_hd_internal": 1000,
  "paytxfee": 0.00000000,
  "hdmasterkeyid": "aaca57ac9a28ad0097a6b4d9f256e9a8ca02b16a"
}

real	0m0.047s
user	0m0.000s
sys	0m0.000s

topup a single key

time ./bitcoin-0.14.1/bin/bitcoin-cli --regtest --datadir=/mnt/hgfs/dummy/n2 keypoolrefill 1001

real	0m0.020s
user	0m0.000s
sys	0m0.000s
@jonasschnelli

This comment has been minimized.

Copy link
Member

jonasschnelli commented Apr 19, 2017

@TheBlueMatt @NicolasDorier:
SAN for servers make sense. But here we talk about key material, maybe even secret key material.
Storing something like this on an Azure system or VPS SAN can be reckless.
What would be the worst case if the SAN/Cloud provider could inject/alter the pubkeys during fetch?
Could an attacker re-route incoming funds?

IMO full nodes (especially those with --enable-wallet) and Azure, AWS, etc. are fundamentally incompatible.

I see reasons to move block files or to a SAN, but what could be the reason to move the wallet BerkleyDB to a NAS? If you have no other options you are very likely doing something very insecure.

@TheBlueMatt

This comment has been minimized.

Copy link
Contributor

TheBlueMatt commented Apr 19, 2017

@jonasschnelli see #10235, I think the changes are pretty simple, so still worth doing. As for discouraging use on remote hosts, I dont buy that argument. What if someone buys insurance or has little money on it? Doesn't mean it shouldnt perform.

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 20, 2017

@jonasschnelli I think the perf are not that bad actually. Will review #10235. It does not seems that bad to fix properly.

I am using Bitcoin Core more for coin tracking with watch only and the hdwatchonly.

When it is not the case, it really depends on how much money are stored on it, versus the cost and inconvenience of hosting my own physical server.

@NicolasDorier

This comment has been minimized.

Copy link
Member Author

NicolasDorier commented Apr 20, 2017

Closing in favor of #10235 which solve the problem completely, and simplify things.

sipa added a commit that referenced this pull request Jul 15, 2017

Merge #10235: Track keypool entries as internal vs external in memory
d40a72c Clarify *(--.end()) iterator semantics in CWallet::TopUpKeyPool (Matt Corallo)
28301b9 Meet code style on lines changed in the previous commit (Matt Corallo)
4a3fc35 Track keypool entries as internal vs external in memory (Matt Corallo)

Pull request description:

  This is an alternative version of #10184. As @jonasschnelli points out there, the performance regressions are pretty minimal, but given that this is a pretty simple, mechanical change, its probably worth doing.

Tree-SHA512: e83f9ebf2998f8164d1b2eebe5e6dcdeadea8c30b7612861f830758c08bf4093cd6a67b3bcfa9cfcb139e5e0b106fc8898a975fc69f334981aefc756568ab613
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.