Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet: cache descriptor ID to avoid repeated descriptor string creation #28799

Conversation

theStack
Copy link
Contributor

@theStack theStack commented Nov 5, 2023

Right now a wallet descriptor is converted to its string representation (via Descriptor::ToString) repeatedly at different instances:

  • on finding a DescriptorScriptPubKeyMan for a given descriptor (CWallet::GetDescriptorScriptPubKeyMan, e.g. used by the importdescriptors RPC); the string representation is created once for each spkm in the wallet and at each iteration again for the searched descriptor (DescriptorScriptPubKeyMan::HasWalletDescriptor)
  • whenever DescriptorScriptPubKeyMan::GetID() is called, e.g. in TopUp or any instances where a descriptor is written to the DB to determine the database key, also at less obvious places like FastWalletRescanFilter etc.

As there is no good reason to calculate a fixed descriptor's string/ID more than once, add the ID as a field to WalletDescriptor and calculate it immediately at initialization (or deserialization). HasWalletDescriptor is changed to compare the spkm's and searched descriptor's ID instead of the string to take use of that.

This speeds up the functional test wallet_miniscript.py by a factor of 5-6x on my machine (3m30.95s on master vs. 0m38.02s on PR). The recently introduced "max-size TapMiniscript" test-case introduced a descriptor that takes 2-3 seconds to create a string representation, so the repeated calls to that were significantly hurting the performance.

Fixes #28800.

Right now a wallet descriptor is converted to it's string representation
(via `Descriptor::ToString`) repeatedly at different instances:
- on finding a `DescriptorScriptPubKeyMan` for a given descriptor
  (`CWallet::GetDescriptorScriptPubKeyMan`, e.g. used by the
  `importdescriptors` RPC); the string representation is created once
  for each spkm in the wallet and at each iteration again for
  the searched descriptor (`DescriptorScriptPubKeyMan::HasWalletDescriptor`)
- whenever `DescriptorScriptPubKeyMan::GetID()` is called, e.g. in
  `TopUp` or any instances where a descriptor is written to the DB
  to determine the database key etc.

As there is no good reason to calculate a fixed descriptor's string/ID
more than once, add the ID as a field to `WalletDescriptor` and
calculate it immediately at initialization (or deserialization).
`HasWalletDescriptor` is changed to compare the spkm's and searched
descriptor's ID instead of the string to take use of that.

This speeds up the functional test `wallet_miniscript.py` by a factor of
5-6x on my machine (3m30.95s on master vs. 0m38.02s on PR). The recently
introduced "max-size TapMiniscript" test-case introduced a descriptor
that takes 2-3 seconds to create a string representation, so the
repeated calls to that were significantly hurting the performance.
@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 5, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK Sjors, S3RK, BrandonOdiwuor, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28710 (Remove the legacy wallet and BDB dependency by achow101)
  • #22838 (descriptors: Be able to specify change and receiving in a single descriptor string by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@Sjors
Copy link
Member

Sjors commented Nov 6, 2023

This fixes #28800 for me and runs way faster under --with-sanitizers=thread.

ACK 5e6bc6d

@S3RK
Copy link
Contributor

S3RK commented Nov 6, 2023

Code Review ACK 5e6bc6d

The change looks correct. I also verified that when we update descriptors with importdescriptor command we replace the whole WalletDescriptor object, so ID remains correct in that case.

Copy link
Contributor

@BrandonOdiwuor BrandonOdiwuor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 5e6bc6d

looks good to me

@achow101
Copy link
Member

achow101 commented Nov 6, 2023

ACK 5e6bc6d

@achow101 achow101 merged commit 0f5e31c into bitcoin:master Nov 6, 2023
16 checks passed
@theStack theStack deleted the 202311-wallet-avoid_repeated_desc_str_id_calculation branch November 7, 2023 18:18
luke-jr pushed a commit to bitcoinknots/bitcoin that referenced this pull request Nov 9, 2023
Right now a wallet descriptor is converted to it's string representation
(via `Descriptor::ToString`) repeatedly at different instances:
- on finding a `DescriptorScriptPubKeyMan` for a given descriptor
  (`CWallet::GetDescriptorScriptPubKeyMan`, e.g. used by the
  `importdescriptors` RPC); the string representation is created once
  for each spkm in the wallet and at each iteration again for
  the searched descriptor (`DescriptorScriptPubKeyMan::HasWalletDescriptor`)
- whenever `DescriptorScriptPubKeyMan::GetID()` is called, e.g. in
  `TopUp` or any instances where a descriptor is written to the DB
  to determine the database key etc.

As there is no good reason to calculate a fixed descriptor's string/ID
more than once, add the ID as a field to `WalletDescriptor` and
calculate it immediately at initialization (or deserialization).
`HasWalletDescriptor` is changed to compare the spkm's and searched
descriptor's ID instead of the string to take use of that.

This speeds up the functional test `wallet_miniscript.py` by a factor of
5-6x on my machine (3m30.95s on master vs. 0m38.02s on PR). The recently
introduced "max-size TapMiniscript" test-case introduced a descriptor
that takes 2-3 seconds to create a string representation, so the
repeated calls to that were significantly hurting the performance.

Github-Pull: bitcoin#28799
Rebased-From: f811a24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

test: wallet_miniscript.py fails with thread sanitizer
6 participants