-
Notifications
You must be signed in to change notification settings - Fork 38.1k
wallet: build: Don't write BDB logs and bump BDB to 5.3 #18844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
We don't use BDB logs so it's pointless to keep them around. These log files are also what has locked us into BDB 4.8. We can just not write them by settting DbEnv flag DB_LOG_IN_MEMORY and Db flag DB_TXN_NOT_DURABLE. The creation of the db log dir and other log settings have been removed. Removing/moving the log is still kept around to remove/move the log files produced previously.
Concept NACK: This breaks backward compatibility, which is why we've always stuck to 4.8.
|
This is maybe addressed in the PR description, which says that database format hasn't actually changes just log format. I'm confused about motivation for this PR though. The goal of this is to make database writes more reliable? Have we seen cases where data has been lost and we think this change would fix? What exactly is the case where this would happen? Or is the goal something different like making writes more efficient? |
It doesn't and testing that is incredibly easy. You don't even need the changes in this PR to find that out for yourself. The log changes here aren't strictly necessary for BDB 5.3 The wallet.dat files themselves are portable between these versions. From the documentation that I've found, we should be able to use the same wallet.dat files down to BDB 4.0, and up to the latest version of BDB
The goal is to remove the strict requirement for BDB 4.8.
It actually does the opposite currently, which is why I added the section at the bottom about alternatives. When I wrote this, I assumed that flushes were occurring immediately after we were done writing a record, but this appears to not be the case. There is a bit of a delay which is mostly not human noticeable but this has an effect on a test like |
Mind pointing to the docs? |
Oh, so disabling the logs fixes the compatibility problem? :o |
My recollection is that the log files have no compatibility guarantees, not even backward compatibility for minor versions (i.e. you can't use a database+log from 4.7 in 4.8) - which is why we flush logs on exit. The database itself is backward compatible, but not forward compatible (you can use db files from 4.7 in 4.8 but not the other way around). I'm unsure about the goal here. There used to be frequent issues with wallets being unable to be opened, and we ended up making several frequent changes to the BDB settings we open envs/databases with. At some point those issues stopped, so I suspect we found a fairly stable configuration. I wouldn't touch these things without going back through history to see what has been tried before. |
There are some changelogs on the previous releases page: https://www.oracle.com/database/technologies/related/berkeleydb-release-history.html. There is more documentation in the source downloads themselves.
That is what I have found in my experimentation and dive through their source code.
This isn't true for 5.3 -> 4.8, I've tested this. A quick look through the code for 4.7 and 4.8 indicate that it should be compatible too. I'm pretty sure the incompatibility there is with the logs. |
Back in 2012 I had hoped we'd switch away from BerkeleyDB before needing to bump it ever. I'm still not convinced this is a particular thing that needs to be changed. I'm kind of scared of these kind of foundational changes to the wallet format and all the unknown issues it might cause downstream. So concept NACK, sorry. |
I've gone through their older versions to check compatibility. The only changes that were made were the depends package for BDB and
Versions 6.0 introduces a new database file version which is not backwards compatible. But it is possible to take an existing wallet.dat, use it with 6.0+, and then go back to the older BDB versions. BDB won't change a database's format without an explicit upgrade. Versions 4.6 and lower have API differences. 4.6.21 had a minimal change and after applying it, the wallet.dat file still worked. 4.5.20 and lower have further API differences that I didn't feel like going through to check the database file compatibility. Given this information, I think that something reasonable to do is to remove |
Concept ACK. I think everyone compiling from source is using bdb 5.3 and I haven't heard of any issues with that. Please include tests to check that bitcoin core master (bdb5.3) can load bitcoin core 0.20.0 (bdb4.8) wallets (and vice-versa). If there are concerns that removing the |
What is the basis for this assumption? I don't know how people are building on Fedora/etc, but Gentoo still has bdb 4.8, and there are Ubuntu PPAs with it. |
I can't remember the last time I built without --with-incompatible-bdb. |
The backwards compatibility test creates a wallet on master (or the PR) and then opens it with an older release. Those older releases are downloaded binaries which ship with BDB 4.8. So as long as this PR doesn't break that test, the change is probably backwards compatible. Narrator: it does break the test: https://travis-ci.org/github/bitcoin/bitcoin/jobs/682087327#L9993 |
I installed BDB 4.8 so I never build with
Yes, this particular change is not backwards compatible, so I will be closing it. However, the particular thing it is failing on is the log files as the |
Berkeley DB uses a database transaction log to ensure Durability (the D of ACID). But we already have a couple of options enabled that remove the guarantee of Durability anyways (we enable DB_TXN_WRITE_NOSYNC). So the log already isn't as useful to us as it could be. Furthermore, we don't really make use of the BDB's transaction feature. Almost every single read and write that we do is itself atomic and a single transaction. We do each read and write of a record as a single transaction. Flushing is always done when a
WalletBatch
is destroyed, and we have many single useWalletBatch
objects.We do, in a few places, batch transactions so even after a transaction is committed, it may not be flushed to the database. In those instances, the log is useful as it does recover whatever wasn't written. But whatever didn't get written also doesn't matter. If a new transaction was being written, the blockchain can be rescanned. If new metadata was being written, the user can just re-enter it. For newly generated keys, those keys didn't see any use if their records didn't get flushed to the db (that flushing always happens before the key generation function returns). So it's fine to lose those keys, and for HD wallets, they will be regenerated anyways, so nothing is lost.
The single exception to the single record single transaction that we usually do is with encrypting existing private keys. This encryption is done in a single transaction so that it is atomic, and that is many records being written. But even with that, the transaction log isn't useful. We abort before anything is flushed to disk so the wallet state is not inconsistent and all of the unencrypted private keys are still there. The encryption can then be redone. So the log is not useful there.
Additionally, we are already trying to get rid of the transaction log as much as possible. We turn on the auto remove (DB_LOG_AUTO_REMOVE) and the logs are always cleaned up on a clean shutdown. As mentioned earlier, the transaction log already has a flag that doesn't always flush the log anyways. So I don't think that just completely avoiding transaction logs entirely is unreasonable.
Logs are disabled by using the db flag DB_TXN_NOT_DURABLE to prevent logs being written for a db, and using the db env flag DB_LOG_IN_MEMORY to keep all db environment logs in memory.
Not having the BDB logs additionally allows us to remove the BDB 4.8 requirement. This requirement was because the Oracle keeps changing the log file format. The database file format itself has not changed in a long time. The log file format difference would supposedly make downgrades more difficult because the log files are not downgradable. However in my testing, this was not actually the case as the log files would be moved out of the way if the database could not be opened, so downgrading wasn't that much of a problem, especially with clean shutdowns. Not having the transaction log be written at all makes this not a problem entirely as there aren't any log files to get in the way. So we can now allow for other BDB versions.
Bumping to BDB 5.3 is done only because that's the version that is provided by many (most?) distros. With this change,
--with-incompatible-bdb
is no longer needed and we can simply build and installation instructions to just "install bdb from your package manager".This change might be contentions, so here are some alternatives that we might want to discuss: