Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wallet: Introduce WalletDatabase abstract class #19334

Merged
merged 4 commits into from Jul 23, 2020

Conversation

achow101
Copy link
Member

@achow101 achow101 commented Jun 20, 2020

A WalletDatabase abstract class is created from BerkeleyDatabase and is implemented by BerkeleyDatabase. First, to get to the point that this is possible, 4 functions need to be added to BerkeleyDatabase: AddRef, RemoveRef, Open, and Close.

First the increment and decrement of mapFileUseCount is refactored into separate functions AddRef and RemoveRef.

Open is introduced as a dummy function. This will raise an exception so that it always fails.

Close is refactored from Flush. The shutdown argument in Flush is removed and instead Flush(true) is now the Close function.

Split from #18971

Requires #19325

@DrahtBot
Copy link
Contributor

@DrahtBot DrahtBot commented Jun 20, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #19335 (wallet: Cleanup and separate BerkeleyDatabase and BerkeleyBatch by achow101)
  • #19101 (refactor: remove ::vpwallets and related global variables by ryanofsky)
  • #18904 (Don't call lsn_reset in periodic flush by bvbfan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Code review ACK 4ea4ad9, just last 4 commits:

  • da7ac55 walletdb: Move BerkeleyDatabase::Flush(true) to Close() (7/10)
  • d34001d walletdb: Introduce AddRef and RemoveRef functions (8/10)
  • bb64a32 walletdb: Add BerkeleyDatabase::Open dummy function (9/10)
  • 4ea4ad9 walletdb: Introduce WalletDatabase abstract class (10/10)

These changes were originally implemented in #18971 and made bigger cleanups, but a lot of changes were reverted so the end result is messier but diffs are smaller, which is reasonable. Suggestions below are for future cleanups

src/wallet/bdb.cpp Show resolved Hide resolved
if (env) {
size_t erased = env->m_databases.erase(strFile);
assert(erased == 1);
env->m_fileids.erase(strFile);
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note for future cleanup: I didn't test but it was reported that this erase fails sometimes #18971 (comment). This seems harmless but is also unexpected. Good to debug and fix in the future.

src/wallet/bdb.h Outdated
*/
void Flush(bool shutdown);
void Close();
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note for future cleanup: Descriptions and names of Close and Flush here are misleading.

I'd expect Database::Flush to reliably flush data to disk like Batch::Flush does. But Database::Flush may or may not flush data in the current database, and may flush data in other databases, since what it actually does is iterate over all databases in the environment and close ones are not in use. If the current database is in use it does nothing.

Database::Close is similarly misnamed. It basically does the same thing as Database::Flush so it or may not close the current database. The only thing it does differently is call log_archive after and remove environment log files.

To clean up this mess, I would ban use of the word "Flush" and:

  • Rename Batch::Flush to Batch::Checkpoint and rename Database::Flush to Database::CloseIfUnused as suggested #18971 (comment)
  • Drop Database::Close and have a separate Archive method, or just archive logs automatically when the environment is closed as originally implemented in e172e78 #18971 but then reverted #18971 (comment)

Copy link
Member Author

@achow101 achow101 Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to keep the naming and move to make the functions actually do as the names suggest. Since the goal is to have a standard interface, changing the names to match the craziness we have for BDB seems counterproductive. For example, in #19077, Flush and Close do work as you would expect.

I think many of these issues stem from the fact that we allow multiple BerkeleyDatabases for one BerkeleyEnvironment. I think if we stopped doing that, we would be able to make Flush and Close behave in the ways that we expect.

g_dbenvs.erase(env->Directory().string());
env = nullptr;
} else {
// TODO: To avoid g_dbenvs.erase erasing the environment prematurely after the
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note: TODO is removed here because it was previously implemented #18971 (comment)

env->Flush(shutdown);
if (shutdown) {
LOCK(cs_db);
g_dbenvs.erase(env->Directory().string());
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note: Erase call was buggy here and is moved to destructor in this commit for reasons described #18971 (comment)

// environment, should replace raw database `env` pointers with shared or weak
// pointers, or else separate the database and environment shutdowns so
// environments can be shut down after databases.
env->m_fileids.erase(strFile);
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note: m_fileids.erase is moving to ~Database destructor in this commit for reasons described #18971 (comment)

BerkeleyDatabase::~BerkeleyDatabase()
{
if (env) {
size_t erased = env->m_databases.erase(strFile);
Copy link
Contributor

@ryanofsky ryanofsky Jun 23, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "walletdb: Move BerkeleyDatabase::Flush(true) to Close()" (da7ac55)

Note for future cleanup, this destructor should be calling CloseDb and asserting !m_db here #18971 (comment) to make sure the database is actually closed

@achow101
Copy link
Member Author

@achow101 achow101 commented Jun 23, 2020

Many of the suggested cleanups are done in #19335 (as I suspect you already know)

Copy link
Member

@Sjors Sjors left a comment

utACK ae35e09 modulo lock, and pending base PR

Glad you moved the cleanup stuff to #19335.

In walletdb.h there's a comment: The following classes are implementation specific: BerkeleyEnvironment, BerkeleyDatabase, BerkeleyBatch. Maybe rename those?

src/wallet/bdb.cpp Show resolved Hide resolved
src/wallet/db.h Outdated Show resolved Hide resolved
achow101 added 4 commits Jul 14, 2020
Instead of having Flush optionally shutdown the database and
environment, add a Close() function that does that.
Refactor mapFileUseCount increment and decrement to separate functions
AddRef and RemoveRef
Adds an Open function for the class abstraction that does nothing for
now.
Make WalletDatabase actually an abstract class and not just a typedef
for BerkeleyDatabase. Have BerkeleyDatabase inherit this class.
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Code review ACK d416ae5. Only changes since last review were rebasing after base PR #19334 merge, and adding cs_db lock in BerkeleyDatabase destructor, which should avoid races accessing env->m_databases and env->m_fileids

(Accidentally posted this review Wednesday in the wrong PR #19335 (review))

Copy link
Contributor

@fjahr fjahr left a comment

Code review ACK d416ae5

  • some very minor manual testing

@@ -121,6 +121,11 @@ class BerkeleyDatabase
*/
bool Rewrite(const char* pszSkip=nullptr);

/** Indicate the a new database user has began using the database. */
Copy link
Contributor

@fjahr fjahr Jul 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for a follow-up:

Suggested change
/** Indicate the a new database user has began using the database. */
/** Indicate that a new database user has begun using the database. */

Copy link
Contributor

@meshcollider meshcollider left a comment

Code review & test run ACK d416ae5

@meshcollider meshcollider merged commit 9d4b3d8 into bitcoin:master Jul 23, 2020
4 checks passed
@meshcollider meshcollider moved this from PRs to Design in Sqlite wallets Jul 23, 2020
@meshcollider meshcollider moved this from Design to Done in Sqlite wallets Jul 23, 2020
@meshcollider meshcollider removed this from Blockers in High-priority for review Jul 23, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this issue Jul 24, 2020
d416ae5 walletdb: Introduce WalletDatabase abstract class (Andrew Chow)
2179dbc walletdb: Add BerkeleyDatabase::Open dummy function (Andrew Chow)
71d28e7 walletdb: Introduce AddRef and RemoveRef functions (Andrew Chow)
27b2766 walletdb: Move BerkeleyDatabase::Flush(true) to Close() (Andrew Chow)

Pull request description:

  A `WalletDatabase` abstract class is created from `BerkeleyDatabase` and is implemented by `BerkeleyDatabase`. First, to get to the point that this is possible, 4 functions need to be added to `BerkeleyDatabase`: `AddRef`, `RemoveRef`, `Open`, and `Close`.

  First the increment and decrement of `mapFileUseCount` is refactored into separate functions `AddRef` and `RemoveRef`.

  `Open` is introduced as a dummy function. This will raise an exception so that it always fails.

  `Close` is refactored from `Flush`. The `shutdown` argument in `Flush` is removed and instead `Flush(true)` is now the `Close` function.

  Split from bitcoin#18971

  Requires bitcoin#19325

ACKs for top commit:
  ryanofsky:
    Code review ACK d416ae5. Only changes since last review were rebasing after base PR bitcoin#19334 merge, and adding cs_db lock in BerkeleyDatabase destructor, which should avoid races accessing env->m_databases and env->m_fileids
  fjahr:
    Code review ACK d416ae5
  meshcollider:
    Code review & test run ACK d416ae5

Tree-SHA512: 98d05ec093d7446c4488e2b0914584222a331e9a2f4d5be6af98e3f6d78fdd8e75526c12f91a8a52d4820c25bce02aa02aabe92d38bee7eb2fce07d0691b7b0d
laanwj added a commit that referenced this issue Jul 29, 2020
…leyBatch

74507ce walletdb: Remove BerkeleyBatch friend class from BerkeleyDatabase (Andrew Chow)
00f0041 No need to check for duplicate fileids in all dbenvs (Andrew Chow)
d86efab walletdb: Move Db->open to BerkeleyDatabase::Open (Andrew Chow)
4fe4b3b walletdb: track database file use as m_refcount within BerkeleyDatabase (Andrew Chow)
65fb880 Combine BerkeleyEnvironment::Verify into BerkeleyDatabase::Verify (Andrew Chow)

Pull request description:

  `BerkeleyBatch` and `BerkeleyDatabase` are kind of messy. The goal of this is to clean up them up so that they are logically separated.

  `BerkeleyBatch` currently handles the creation of the `BerkeleyDatabase`'s `Db` handle. This is instead moved into `BerkeleyDatabase` and is called by `BerkeleyBatch`.

  Instead of having `BerkeleyEnvironment` track each database's usage, have `BerkeleyDatabase` track this usage itself with the `m_refcount` variable that is present in `WalletDatabase`.

  Lastly, instead of having each `BerkeleyEnvironment` store the fileids of the databases open in it, have a global `g_fileids` to track those fileids. We were already checking fileid uniqueness globally (by checking the fileids in every environment when opening a database) so it's cleaner to do this with a global variable.

  All of these changes allow us to make `BerkeleyBatch` and `BerkeleyDatabase` no longer be friend classes.

  The diff of this PR is currently the same as in ##18971

  Requires #19334

ACKs for top commit:
  laanwj:
    Code review ACK 74507ce
  ryanofsky:
    Code review ACK 74507ce. No changes since last review other than rebase

Tree-SHA512: 845d84ee1a470e2bf5d2e2e3d7738183d8ce43ddd06a0bbd57edecf5779b2f55d70728b1b57f5daab0f078650a8d60c3e19dc30b75b36e7aa952ce268399d5f6
MarcoFalke added a commit that referenced this issue Jul 30, 2020
…y BerkeleyDatabase

0fcff54 walletdb: Ensure that having no database handle is a failure (Andrew Chow)
da039d2 Remove BDB dummy databases (Andrew Chow)
0103d64 Introduce DummyDatabase and use it in the tests (Andrew Chow)

Pull request description:

  In the unit tests, we use a dummy `WalletDatabase` which does nothing and always returns true. This is currently implemented by creating a `BerkeleyDatabase` in dummy mode. This PR instead adds a `DummyDatabase` class which does nothing and never fails for use in the tests. `CreateDummyWalletDatabase` is changed to return this `DummyDatabase` and `BerkeleyDatabase` is cleaned up to remove all of the checks for `IsDummy`.

  Based on `WalletDatabase` abstract class introduced in #19334

ACKs for top commit:
  instagibbs:
    utACK 0fcff54
  MarcoFalke:
    crACK 0fcff54 🚈

Tree-SHA512: 05fbf32e078753e9a55a05f4c080b6d365b909a2a3a8e571b7e64b59ebbe53da49394f70419cc793192ade79f312f5e0422ca7c261ba81bae5912671c5ff6402
sidhujag pushed a commit to syscoin/syscoin that referenced this issue Jul 31, 2020
…d BerkeleyBatch

74507ce walletdb: Remove BerkeleyBatch friend class from BerkeleyDatabase (Andrew Chow)
00f0041 No need to check for duplicate fileids in all dbenvs (Andrew Chow)
d86efab walletdb: Move Db->open to BerkeleyDatabase::Open (Andrew Chow)
4fe4b3b walletdb: track database file use as m_refcount within BerkeleyDatabase (Andrew Chow)
65fb880 Combine BerkeleyEnvironment::Verify into BerkeleyDatabase::Verify (Andrew Chow)

Pull request description:

  `BerkeleyBatch` and `BerkeleyDatabase` are kind of messy. The goal of this is to clean up them up so that they are logically separated.

  `BerkeleyBatch` currently handles the creation of the `BerkeleyDatabase`'s `Db` handle. This is instead moved into `BerkeleyDatabase` and is called by `BerkeleyBatch`.

  Instead of having `BerkeleyEnvironment` track each database's usage, have `BerkeleyDatabase` track this usage itself with the `m_refcount` variable that is present in `WalletDatabase`.

  Lastly, instead of having each `BerkeleyEnvironment` store the fileids of the databases open in it, have a global `g_fileids` to track those fileids. We were already checking fileid uniqueness globally (by checking the fileids in every environment when opening a database) so it's cleaner to do this with a global variable.

  All of these changes allow us to make `BerkeleyBatch` and `BerkeleyDatabase` no longer be friend classes.

  The diff of this PR is currently the same as in #bitcoin#18971

  Requires bitcoin#19334

ACKs for top commit:
  laanwj:
    Code review ACK 74507ce
  ryanofsky:
    Code review ACK 74507ce. No changes since last review other than rebase

Tree-SHA512: 845d84ee1a470e2bf5d2e2e3d7738183d8ce43ddd06a0bbd57edecf5779b2f55d70728b1b57f5daab0f078650a8d60c3e19dc30b75b36e7aa952ce268399d5f6
MarcoFalke added a commit that referenced this issue Dec 17, 2020
23cac24 tests: Test bitcoin-wallet dump and createfromdump (Andrew Chow)
a88c320 wallettool: Add createfromdump command (Andrew Chow)
e1e7a90 wallettool: Add dump command (Andrew Chow)

Pull request description:

  Adds two commands to the `bitcoin-wallet` tool: `dump` and `createfromdump`. These commands will be useful for a wallet storage migration in the future. It is also generally useful to have a storage agnostic dump like this. These commands are similar to BDB's `db_dump` and `db_load` tools. This can also be useful for manual construction of a wallet file for tests.

  `dump` outputs every key-value pair from the wallet as comma separated hex. Each key-value pair is on its own line with the key and value in hex separated by a comma. This is output to the file specified by the new `-dumpfile` option.

  `createfromdump` takes a file produced by `dump` and creates a new wallet file with exactly the records specified in that file.

  A new option `-dumpfile` is added to the wallet tool. When used with `dump`, the records will be written to the specified file. When used with `createfromdump`, the file is read and the key-value pairs constructed from it. `createfromdump` requires `-dumpfile`.

  A simple round-trip test is added to the `tool_wallet.py`.

  This PR is based on #19334,

ACKs for top commit:
  Sjors:
    re-utACK 23cac24
  MarcoFalke:
    re review ACK 23cac24 only change is rebase and removing useless shared_ptr wrapper 🎼
  ryanofsky:
    Code review ACK 23cac24. Only changes since last review rebase and changing a pointer to a reference

Tree-SHA512: 2d63cf62baca3d16495aa698dc02f7d889c81b41015e9c92c23c275bb4a690fc176d351c3fd7f310bd6b17f5a936cc9be694cbecd702af741b96c0f530e72fa2
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this issue Sep 2, 2021
Summary:
Instead of having Flush optionally shutdown the database and
environment, add a Close() function that does that.

This is a backport of [[bitcoin/bitcoin#19334 | core#19334]] [1/4]
bitcoin/bitcoin@27b2766

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D10022
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this issue Sep 2, 2021
Summary:
Refactor mapFileUseCount increment and decrement to separate functions
AddRef and RemoveRef

This is a backport of [[bitcoin/bitcoin#19334 | core#19334]] [2/4]
bitcoin/bitcoin@71d28e7

Depends on D10022

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D10023
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this issue Sep 2, 2021
Summary:
Adds an Open function for the class abstraction that does nothing for
now.

This is a backport of [[bitcoin/bitcoin#19334 | core#19334]] [3/4]
bitcoin/bitcoin@2179dbc

Depends on D10023

Test Plan: `ninja`

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D10024
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this issue Sep 2, 2021
Summary:
Make WalletDatabase actually an abstract class and not just a typedef
for BerkeleyDatabase. Have BerkeleyDatabase inherit this class.

This is a backport of [[bitcoin/bitcoin#19334 | core#19334]] [4/4]
bitcoin/bitcoin@d416ae5

Depends on D100234

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, majcosta

Reviewed By: #bitcoin_abc, majcosta

Differential Revision: https://reviews.bitcoinabc.org/D10025
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

7 participants