[29.x] Backport wallets directory deletion fixes#34222
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34222. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste |
|
ACK 5b8e09b |
5b8e09b to
c6877e3
Compare
Github-Pull: bitcoin#31423 Rebased-From: d04f6a9
c6877e3 to
5a1e79d
Compare
|
@davidgumberg discovered that a form of the deletion bug is also reachable if the user migrates a wallet at a relative path, resulting the entire directory at that relative path being deleted on a migration failure. This is problematic when that relative path contains more than just a wallet file. For example, with a wallet.dat in the datadir, migrating the wallet named "../" would result in deleting the entire datadir. To resolve this, I've backported the entirety of #34156 and #34226 which adds tests for this case. These backports are largely unclean as well. |
5a1e79d to
1bfcac1
Compare
1bfcac1 to
3606219
Compare
| // Before deleting the wallet's directory, copy the backup file to the top-level wallets dir | ||
| fs::path temp_backup_location = fsbridge::AbsPathJoin(GetWalletDir(), backup_filename); | ||
| fs::copy_file(backup_path, temp_backup_location, fs::copy_options::none); | ||
| wallet_files_to_remove.insert(backup_path); |
There was a problem hiding this comment.
If the wallet originally was in the wallet dir directly, we'll copy it over itself and then delete it?
Why not just move the file here (and never delete the backup)?
There was a problem hiding this comment.
Doing that breaks a bunch of tests and it got increasingly complicated to fix.
There was a problem hiding this comment.
Actually, it looks like copy_file is throwing an exception: "filesystem error: cannot copy file: File exists"
So it's not deleting the backup, but neither is it doing the intended cleanup at all. Probably want to extend the test to verify the cleanup happens, and then fix this.
(to be clear, for test "Test failure during migration of wallet named: unnamed (default)")
There was a problem hiding this comment.
The tests already check that this error is being hit and that the cleanup does not occur.
However this was needed for the other tests where cleanup is occurring. It is not universally true that the cleanup does not occur. It should only be for the unnamed wallet and for wallets by absolute path.
| files.emplace_back(env->Directory() / ".walletlock"); | ||
| files.emplace_back(env->Directory() / "database" / "log.0000000001"); | ||
| files.emplace_back(env->Directory() / "database"); |
There was a problem hiding this comment.
If there are other wallets in the same directory, these may be needed to avoid corrupting them, and are not safe to delete with this wallet.
There was a problem hiding this comment.
Good point. Added a check that env->m_databases.size() == 1 to return these.
It's only the unnamed wallet and direct files in the walletsdir that can possibly run into this issue. For wallets in a subdirectory, it will be one wallet.dat per environment.
There was a problem hiding this comment.
m_databases only includes open wallets. Other wallets dependent on these files might not be open.
There was a problem hiding this comment.
Wallets are compacted on shutdown so they should not have any of these files.
This also matches the behavior of BerkeleyDatabase::Flush.
There was a problem hiding this comment.
Not sure if it worth discussing this further. The Files() implementation for BDB exists just for completeness, it is not being called anywhere during migration right now.
There was a problem hiding this comment.
it is not being called anywhere during migration right now.
Oh right, only SQLiteDatabase::Files() is called by migration. BerkeleyDatabase::Files() is only called by createfromdump's cleanup if the user creates into a BDB file.
There was a problem hiding this comment.
Wallets are compacted on shutdown so they should not have any of these files.
Only on clean shutdown. It's possible the node crashed or power failed, etc.
This also matches the behavior of BerkeleyDatabase::Flush.
Yes, this is an existing bug: bitcoinknots#242 (happy to PR here if there's interest)
There was a problem hiding this comment.
It would be nice to focus on the high priority issue first. The goal of the backport is to fix the top-level directory deletion issue. Any edge case during the createfromdump cleanup is really minimal in comparison. createfromdump is a niche external wallet utility intended for development mostly.
There was a problem hiding this comment.
Yes, this is an existing bug: bitcoinknots#242 (happy to PR here if there's interest)
Briefly glancing through that pr, I don't think it solves any of the issues discussed in this thread. According to BDB's docs, the only error that lsn_reset, txn_checkpoint, and log_archive can return is EINVAL, so I don't think adding error handling there is meaningfully useful. It looks like that behavior has also existed since the initial commit of this project.
As best as I can tell, there is no way to prevent BDB from possibly becoming corrupted when the log files are removed. Even with log_archive(DB_ARCH_REMOVE), it still keeps the most recent log file, and there does not seem to be any way to determine whether that log file is actually still useful. The only way to avoid corruption there is to never delete the log files. However, the purpose of deleting the log files is to allow people to move between BDB versions (or more specifically, between a self compiled binary against a different BDB version and the release binaries) without the log file incompatibility issue, and that is probably still useful behavior. This behavior has also existed for nearly 13 years now.
I think this corruption is extremely unlikely to occur anyways. I'm finding it hard to even trigger these corruptions even with knowing the conditions they should occur in. Given that BDB was removed from the project several months ago, I doubt there is any appetite to review fixes for these issues, especially as the versions they affect go EOL.
| /** Return path to main database filename */ | ||
| std::string Filename() override { return fs::PathToString(env->Directory() / m_filename); } | ||
|
|
||
| std::vector<fs::path> Files() override |
There was a problem hiding this comment.
In what scenario is .Files() ever called on a BerkeleyDatabase? Just createfromdump?
There was a problem hiding this comment.
It is now used by the changes to migration.
It's used by RestoreWallet for cleanup as well.
Github-Pull: bitcoin#31423 Rebased-From: 1de423e
Github-Pull: bitcoin#34215 Rebased-From: f78f6f1
Track what RestoreWallet creates so only those files and directories are removed during a failure and nothing else. Preexisting paths must be left untouched. Note: Using fs::remove_all() instead of fs::remove() in RestoreWallet does not cause any problems currently, but the change is necessary for the next commit which extends RestoreWallet to work with existing directories, which may contain files that must not be deleted. Github-Pull: bitcoin#34156 Rebased-From: 4ed0693
3606219 to
f159bb6
Compare
| # Check migration failure | ||
| mocked_time = int(time.time()) | ||
| self.master_node.setmocktime(mocked_time) | ||
| assert_raises_rpc_error(-4, "last wallet synchronisation goes beyond pruned data. You need to -reindex (download the whole blockchain again in case of pruned node)", self.master_node.migratewallet, wallet_name=wallet_name) | ||
| self.master_node.setmocktime(0) | ||
|
|
||
| # Verify the wallet is stil BDB and backup is there. | ||
| self.restart_node(0, ["-reindex", "-nowallet"]) | ||
| self.connect_nodes(0, 1) | ||
| self.master_node.loadwallet(wallet_name) | ||
| migrated = self.master_node.get_wallet_rpc(wallet_name) | ||
| info = migrated.getwalletinfo() | ||
| assert_equal(info["descriptors"], False) | ||
| assert_equal(info["format"], "bdb") | ||
| backup_path = self.master_node.wallets_path / wallet_name / f"{wallet_name}_{mocked_time}.legacy.bak" | ||
| assert backup_path.exists() |
There was a problem hiding this comment.
How is this test passing?
We first attempt to migrate the wallet and fail, then call loadwallet with the same bdb wallet when v29 has them disabled by default.
The master_node is not restarted with -deprecatedrpc=create_bdb so master_node.loadwallet should be throwing a "Build does not support Berkeley DB database format".
There was a problem hiding this comment.
-deprecatedrpc=create_bdb is only necessary for createwallet, not loadwallet.
| def unsynced_wallet_on_pruned_node_fails(self): | ||
| self.log.info("Test migration of an unsynced wallet on a pruned node fails gracefully") | ||
| wallet_name = "pruned" |
There was a problem hiding this comment.
In 4831bfa:
Why not the unnamed wallet? The error was there, not in a named one.
update: please see #34222 (comment)
There was a problem hiding this comment.
Because the unnamed wallet will hit the filesystem error and not the pruned error.
|
Created a branch with all fixes in separate commits. https://github.com/furszy/bitcoin-core/tree/pr34222
|
When migrating any legacy unnamed wallet, a failed migration would cause the cleanup logic to remove its parent directory. Since this type of legacy wallet lives directly in the main '/wallets/' folder, this resulted in unintentionally erasing all wallets, including the backup file. To be fully safe, we will no longer call `fs::remove_all`. Instead, we only erase the individual db files we have created, leaving everything else intact. The created wallets parent directories are erased only if they are empty. As part of this last change, `RestoreWallet` was modified to allow an existing directory as the destination, since we no longer remove the original wallet directory (we only remove the files we created inside it). This also fixes the restore of top-level default wallets during failures, which were failing due to the directory existence check that always returns true for the /wallets/ directory. This bug started after: bitcoin@f6ee59b Previously, the `fs::copy_file` call was failing for top-level wallets, which prevented the `fs::remove_all` call from being reached. Github-Pull: bitcoin#34156 Rebased-From: f4c7e28
f159bb6 to
367c140
Compare
|
Taken @furszy's suggestions |
| fs::copy_file(backup_path, temp_backup_location, fs::copy_options::none); | ||
| fs::rename(backup_path, temp_backup_location); |
There was a problem hiding this comment.
Note that by changing this line, the error that 29.x would previously hit on an unnamed wallet migration failure is removed, and would introduce the bug present on 30.0 here. However, this is safe to do because of the other changes in this commit which prevent the deletions.
367c140 to
6fb06b8
Compare
Verifies that a failed migration of the unnamed (default) wallet does not erase the main /wallets/ directory, and also that the backup file exists. Github-Pull: bitcoin#34156 Rebased-From: 36093bd
…rune failure The first test verifies that restoring into an existing empty directory or a directory with no .dat db files succeeds, while restoring into a dir with a .dat file fails. The second test covers restoring into the default unnamed wallet (wallet.dat), which also implicitly exercises the recovery path used after a failed migration. The third test covers failure during restore on a prune node. When the wallet last sync was beyond the pruning height. Github-Pull: bitcoin#34156 Rebased-From: f011e0f
Right now, after migration the last message users see is "migration completed", but the migration isn't actually finished yet. We still need to load the new wallets to ensure consistency, and if that fails, the migration will be rolled back. This can be confusing for users. This change logs the post-migration loading step and if a wallet fails to load and the migration will be rolled back. Github-Pull: bitcoin#34156 Rebased-From: d70b159
Because the default wallet has no name, the watch-only and solvables wallets created during migration end up having no name either. This fixes it by applying the same prefix name we use for the backup file for an unnamed default wallet. Before: watch-only wallet named "_watchonly" After: watch-only wallet named "default_wallet_watchonly" Github-Pull: bitcoin#34156 Rebased-From: 82caa81
…eight Github-Pull: bitcoin#34156 Rebased-From: b7c34d0
|
|
||
| mocked_time = int(time.time()) | ||
| self.master_node.setmocktime(mocked_time) | ||
| assert_raises_rpc_error(-1, "filesystem error: cannot copy file: File exists", self.migrate_and_get_rpc, "") |
There was a problem hiding this comment.
There was a problem hiding this comment.
In c95b112: this should be:
assert_raises_rpc_error(-4, "Failed to create database", self.migrate_and_get_rpc, "")
Already fixing that
Also, should pull cbf0bd3 from #34221.
Remember that this is setting the mock time twice, one outside
migrate_and_get_rpcand another time inside it.
migrate_and_get_rpc in 29.x does not do setmocktime.
There was a problem hiding this comment.
migrate_and_get_rpc in 29.x does not do setmocktime.
👍🏼 .
Refactor a common way to perform the failed migration test that exists for default wallets, and add relative-path wallets and absolute-path wallets. Github-Pull: 34226 Rebased-From: eeaf28d
6fb06b8 to
76cdeb7
Compare
brunoerg
left a comment
There was a problem hiding this comment.
light code review ACK 76cdeb7 + backported the functional tests without the fixes and all of them failed accordingly.
bruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/tool_wallet.py
...
2026-01-12T18:20:20.949000Z TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
self.run_test()
File "/home/bruno/projects/bitcoin/./build/test/functional/tool_wallet.py", line 853, in run_test
self.test_dump_createfromdump()
File "/home/bruno/projects/bitcoin/./build/test/functional/tool_wallet.py", line 601, in test_dump_createfromdump
assert self.nodes[0].wallets_path.exists()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionErrorbruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/wallet_backup.py
...
2026-01-12T18:21:24.258000Z TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 160, in try_rpc
fun(*args, **kwds)
File "/home/bruno/projects/bitcoin/test/functional/test_framework/coverage.py", line 50, in __call__
return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bruno/projects/bitcoin/test/functional/test_framework/authproxy.py", line 151, in __call__
raise JSONRPCException(response['error'], status)
test_framework.authproxy.JSONRPCException: Failed to create database path '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0'. Database already exists. (-36)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
self.run_test()
File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_backup.py", line 359, in run_test
self.restore_wallet_existent_name()
File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_backup.py", line 160, in restore_wallet_existent_name
assert_raises_rpc_error(
File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 151, in assert_raises_rpc_error
assert try_rpc(code, message, fun, *args, **kwds), "No exception raised"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 166, in try_rpc
raise AssertionError(
AssertionError: Expected substring not found in error message:
substring: 'Failed to restore wallet. Database file exists in '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0/wallet.dat'.'
error message: 'Failed to create database path '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0'. Database already exists.'.bruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/wallet_migration.py
...
2026-01-12T18:23:08.626000Z TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
self.run_test()
File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_migration.py", line 1873, in run_test
self.test_migration_failure(wallet_name=wallet_name)
File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_migration.py", line 773, in test_migration_failure
assert_raises_rpc_error(
File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 151, in assert_raises_rpc_error
assert try_rpc(code, message, fun, *args, **kwds), "No exception raised"
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: No exception raised
Done |
6517394 QA: tool_wallet: Check that db.log is deleted with a lone legacy wallet, but not with a shared db environment (Luke Dashjr) 69a6b9b Bugfix: Wallet/Migration: Move backup into wallet directory when migrating from non-directory (Luke Dashjr) cef01d0 Wallet/Migration: Skip moving the backup file back and forth for no reason (Luke Dashjr) 60f5290 Wallet/Migration: If loading the new watchonly or solvables wallet fails, log the correct wallet name in error message (Luke Dashjr) 7475d13 Wallet/bdb: Safely and correctly list files only used by the single wallet (Luke Dashjr) Pull request description: ACKs for top commit: achow101: ACK 6517394 furszy: light ACK 6517394 Tree-SHA512: c10fe00dde512ca78cd6939a748b3875d0b40e9714997aedfd939a1dffdc7eaa2fd1779f3972a34b1c1d9a97d8f1ee1e082c970de15ac0e2ef5d9bbf3dc1d89a
4d21972 wallet: test: Failed migration cleanup (David Gumberg) 77622e0 test: coverage for migration failure when last sync is beyond prune height (furszy) 86eaf71 wallet: migration, fix watch-only and solvables wallets names (furszy) fb4406e wallet: improve post-migration logging (furszy) 75b59e5 test: restorewallet, coverage for existing dirs, unnamed wallet and prune failure (furszy) 1d46624 test: add coverage for unnamed wallet migration failure (furszy) e47af69 wallet: fix unnamed wallet migration failure (furszy) e1e9d71 wallet: RestoreWallet failure, erase only what was created (furszy) b54cdfc wallettool: do not use fs::remove_all in createfromdump cleanup (Ava Chow) 5f07b93 wallet: introduce method to return all db created files (furszy) 0a944e6 refactor: remove sqlite dir path back-and-forth conversion (furszy) Pull request description: Backports #34222 to 28.x ACKs for top commit: davidgumberg: crACK 4d21972 polespinasa: crACK 4d21972 Tree-SHA512: ea872c78e7403f2fe2c7e66dc3215ac01cefadea0b50a6cf2067220eb5138e23c6b4756c49582dc248c102ed8d4b67cde418fb557ea9e608920d0268ac369794
7263b0f wallet: test: Failed migration cleanup (David Gumberg) 5ecf0b9 test: coverage for migration failure when last sync is beyond prune height (furszy) 4d51e10 wallet: migration, fix watch-only and solvables wallets names (furszy) 5840675 wallet: improve post-migration logging (furszy) d99d8b3 test: restorewallet, coverage for existing dirs, unnamed wallet and prune failure (furszy) 2338708 test: add coverage for unnamed wallet migration failure (furszy) 3fda22a wallet: fix unnamed wallet migration failure (furszy) cc0607e wallet: RestoreWallet failure, erase only what was created (furszy) 0a0f0c6 wallettool: do not use fs::remove_all in createfromdump cleanup (Ava Chow) 1259bb8 wallet: introduce method to return all db created files (furszy) 40cbfe8 refactor: remove sqlite dir path back-and-forth conversion (furszy) Pull request description: Backports bitcoin#34222 to 28.x ACKs for top commit: davidgumberg: crACK 7263b0f polespinasa: crACK 7263b0f Tree-SHA512: ea872c78e7403f2fe2c7e66dc3215ac01cefadea0b50a6cf2067220eb5138e23c6b4756c49582dc248c102ed8d4b67cde418fb557ea9e608920d0268ac369794
1020e8e wallet: test: Failed migration cleanup (David Gumberg) 398271d test: coverage for migration failure when last sync is beyond prune height (furszy) 58b6fc1 wallet: migration, fix watch-only and solvables wallets names (furszy) 4ad3790 wallet: improve post-migration logging (furszy) c1a00dc test: restorewallet, coverage for existing dirs, unnamed wallet and prune failure (furszy) 20c3ec4 test: add coverage for unnamed wallet migration failure (furszy) 6bc81fb wallet: fix unnamed wallet migration failure (furszy) 3cb7f9f wallet: RestoreWallet failure, erase only what was created (furszy) c67b0fe wallettool: do not use fs::remove_all in createfromdump cleanup (Ava Chow) 50e0eab wallet: introduce method to return all db created files (furszy) a867f5d refactor: remove sqlite dir path back-and-forth conversion (furszy) Pull request description: Backports: * bitcoin#34215 * bitcoin#34156 * bitcoin#34226 * 2 required commits from bitcoin#31423 Note that this backport is unclean and several changes have to be made to most commits to accommodate BDB and the differences in migration cleanup behavior. ACKs for top commit: furszy: Code review ACK 1020e8e brunoerg: light code review ACK 1020e8e + backported the functional tests without the fixes and all of them failed accordingly. glozow: light review ACK 1020e8e. Tree-SHA512: 432268117783fc9a221d895a6f6601b6a2a5031c76d1443cf804cc1d486b40fcded982409d548acd1c01a13c7b378b840fcc3fbe823d6ba5ffc4ebe017d4e13c
2aa94c7 QA: tool_wallet: Check that db.log is deleted with a lone legacy wallet, but not with a shared db environment (Luke Dashjr) 60bbfb9 Bugfix: Wallet/Migration: Move backup into wallet directory when migrating from non-directory (Luke Dashjr) ba4f810 Wallet/Migration: Skip moving the backup file back and forth for no reason (Luke Dashjr) fe3eca6 Wallet/Migration: If loading the new watchonly or solvables wallet fails, log the correct wallet name in error message (Luke Dashjr) a649a08 Wallet/bdb: Safely and correctly list files only used by the single wallet (Luke Dashjr) Pull request description: ACKs for top commit: achow101: ACK 2aa94c7 furszy: light ACK 2aa94c7 Tree-SHA512: c10fe00dde512ca78cd6939a748b3875d0b40e9714997aedfd939a1dffdc7eaa2fd1779f3972a34b1c1d9a97d8f1ee1e082c970de15ac0e2ef5d9bbf3dc1d89a
Backports:
Note that this backport is unclean and several changes have to be made to most commits to accommodate BDB and the differences in migration cleanup behavior.