Improvement in bbolt/walletdb speed #111

Closed
wants to merge 4 commits into from

Conversation


@johnsonjh johnsonjh commented Oct 29, 2020

  • First, we take the size of the walletdb and set our initial mmap size to (current size * 2.5) - this reduces the amount of re-mapping that must be done as the database doubles in size and also reduces memory fragmentation. The actual result is lower memory usage overall due to reduced fragmentation, though usage is higher initially.

  • Also enable the NoFreelistSync option to skip syncing the freelist to disk, which yields a massive improvement in write performance.

  • Combined, tests on a Gridnode show a ~250% gain.

  • Updated Neutrino and wallettool for the new API, as well as relevant documentation.
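
The two knobs described above correspond to fields on bbolt's Options struct (InitialMmapSize and NoFreelistSync in go.etcd.io/bbolt). A minimal sketch of the sizing logic, with the bbolt call shown only in comments so the example stays self-contained; the file name and 1MB size are illustrative:

```go
package main

import "fmt"

// initialMmapSize returns the proposed starting mmap size:
// 2.5x the current on-disk database size, which reduces how
// often bbolt must remap (and double) the mapping as it grows.
func initialMmapSize(dbFileSize int64) int {
	return int(dbFileSize * 5 / 2) // 2.5x without floating point
}

func main() {
	size := int64(1_000_000) // pretend the walletdb is ~1MB on disk

	// The real patch would pass this straight to bbolt, roughly:
	//   opts := &bbolt.Options{
	//       InitialMmapSize: initialMmapSize(size),
	//       NoFreelistSync:  true, // don't sync the freelist to disk
	//   }
	//   db, err := bbolt.Open("wallet.db", 0600, opts)
	fmt.Println(initialMmapSize(size)) // → 2500000
}
```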

@johnsonjh
Author

My rationale for passing the opts is that I plan to use additional bbolt options for tuning via AllocSize and DefaultAllocSize (once I've done enough testing to determine the correct number), and I'll be sending you a PR in the next day or so for a new command-line option to pktwallet that enables StrictMode checking.

Because the BBoltDB bdb backend is used for more than just the walletdb, different options are appropriate for different use cases (wallet vs. NeutrinoDB, etc.), so a one-size-fits-all solution that sets static options in just one place would likely be counter-productive.

@johnsonjh
Author

Also, I now have a real benchmark for bboltdb recovery time - the added start-up overhead of rebuilding the freelist on a GNC is approximately 40 seconds with a 1GB walletdb file. The same test on my i7 desktop takes about 3 seconds.

@cjdelisle
Member

  1. I want everything using the same db configuration, even if it's slower - because every different way of configuring the db is a chance for bugs.
  2. I want to get rid of this `func Create(dbType string, args ...interface{}) (DB, er.R) {`, which is an accident waiting to happen; that func should use a normal signature.
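
A sketch of what a "normal signature" might look like here; the Options struct and function names are hypothetical stand-ins (the real change would presumably take *bbolt.Options directly):

```go
package main

import "fmt"

// Options is a hypothetical typed stand-in for *bbolt.Options.
type Options struct {
	InitialMmapSize int
	NoFreelistSync  bool
}

type DB struct{ path string }

// CreateDB replaces Create(dbType string, args ...interface{}):
// passing the wrong types is now a compile-time error rather than
// a runtime type-assertion failure.
func CreateDB(path string, create bool, opts *Options) (*DB, error) {
	if opts == nil {
		opts = &Options{} // fall back to defaults
	}
	return &DB{path: path}, nil
}

func main() {
	db, err := CreateDB("wallet.db", true, &Options{NoFreelistSync: true})
	fmt.Println(db.path, err) // → wallet.db <nil>
	// CreateDB(42, "oops") would not compile, unlike the old
	// variadic interface{} version, which would only fail at runtime.
}
```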

@johnsonjh
Author

johnsonjh commented Nov 5, 2020

1. I want everything using the same db configuration, even if it's slower - because every different way of configuring the db is a chance for bugs.

Unfortunately, the configuration that works for the wallet database is not going to be appropriate for the neutrino database, and you won't be able to easily support 32-bit platforms. To be clear, 32-bit support is broken right now, without this patch: Bolt/BBolt defaults to unclamped direct memory mapping and the default expansion configuration, which causes the database to fail to mmap once either the walletdb or neutrinodb grows to 4GB (or both grow to 2GB) on 32-bit platforms. These sizes are already commonly seen in production on many Gridnodes, where the median database size is about ~2GB. This limitation is explicitly noted in the project's README at https://github.com/etcd-io/bbolt#caveats--limitations and in boltdb/bolt#308 (comment).

Unfortunately, using DB compaction to keep these sizes under control would impose massive I/O requirements from constantly recopying and re-mapping the database when the compacted version is used, greatly increasing the chances of database corruption. So using different options is the only solution - or using a different database.

So, unless we decide to throw out BBoltDB/BoltDB completely and change to a different database that doesn't use direct memory mapping, or we switch to different databases specifically tuned to each use case (which seems like an even larger chance for bugs), then using different options is absolutely a necessity - even more so if we plan to actually support Gridnodes on the hard shutdown path in the future, because the options needed for expansion and initial mmap size change dynamically depending on the current size of the on-disk database. This is how etcd (bbolt's upstream) handles it as well.

I'm actually totally with you here - I want one configuration for everything too, but it's just not going to work if we keep the databases we're currently using. As I've mentioned to you before, this is the same route currently being taken by lnd and upstream btcd. Perhaps we could find one set of options that mostly works for most people in the regular distribution, and I'd maintain a separate fork for GF use? I'm not sure that's a situation you'd be happy with either, but it could be an option.

2. I want to get rid of this `func Create(dbType string, args ...interface{}) (DB, er.R) {` which is an accident waiting to happen, that func should use a normal signature.

From everything I've looked at, that seems to be the normal way of doing it, and the pattern is essentially universal - see https://golang.org/pkg/database/sql/ etc. It's also done this way upstream, as well as in all the related projects. Even Decred has the same interface - amazing they haven't rewritten it yet! While I guess we could do a rewrite here, we'd be doing things completely differently from everyone else, and it would be a pretty large divergence from upstream btcd and all related projects.
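
For reference, the pattern under discussion looks roughly like this - a simplified sketch of the driver-registry idiom used by database/sql and walletdb; the names are illustrative, not the actual pktd code:

```go
package main

import "fmt"

// Driver is a simplified stand-in for a walletdb/sql-style backend.
type Driver struct {
	Create func(args ...interface{}) (string, error)
}

// drivers is the global registry, keyed by backend name.
var drivers = map[string]Driver{}

func Register(name string, d Driver) { drivers[name] = d }

// Create looks up a backend by string name and forwards untyped
// args - misuse is only caught at runtime, which is the objection.
func Create(dbType string, args ...interface{}) (string, error) {
	d, ok := drivers[dbType]
	if !ok {
		return "", fmt.Errorf("unknown driver %q", dbType)
	}
	return d.Create(args...)
}

func main() {
	Register("bdb", Driver{Create: func(args ...interface{}) (string, error) {
		path, ok := args[0].(string) // runtime type check
		if !ok {
			return "", fmt.Errorf("first argument must be a path string")
		}
		return path, nil
	}})
	p, err := Create("bdb", "wallet.db")
	fmt.Println(p, err) // → wallet.db <nil>
}
```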

@johnsonjh
Author

johnsonjh commented Nov 5, 2020

Also, I've already investigated switching to storing data the same way as Core, which uses Sleepycat BerkeleyDB as its on-disk format for wallet data (and for a few other databases as well), but it just isn't going to happen with Go, unless we either do a lot of work (and Oracle will probably sue us, even without any grounds to win, because that's what they do) or put ourselves at the mercy of Oracle's licensing and giant C++ codebase.

Core is using v4.8 of BDB with some simple modifications - the main reason they don't upgrade to a newer BDB is that the database was purchased by Oracle, who changed the licensing terms (and if they changed them once, they might change them again). Going back through the mailing lists, you can see there was (often heated) discussion of upgrading the on-disk DB (and thus BDB) for quite a while - the main concerns were new bugs and wallet file compatibility between systems - but the Oracle purchase essentially eliminated that option completely.

You can certainly build the Satoshi code with any current Oracle BDB, but upon opening the database, its on-disk format is automatically upgraded by the Oracle BDB code and becomes incompatible with any previous versions, so you can never downgrade - there is no path back to the more freely licensed releases. Oracle becomes the upstream and only path for future bugfixes. Also, Oracle changes the on-disk format regularly (for a while, upon every release). I'd link the changelogs for you, but you need an Oracle account to even look at them.

Oracle does make Go bindings for BDB available, but adding BDB would require adding a large non-Go (C++) codebase to pktd, it would massively complicate the build process, and the current Go bindings are not "freely available" in the OSI sense - you need to register an account with Oracle and agree to their license agreement to get them, which of course states they can change the licensing at any time. They also apparently bullied the developer of the alternative open Go bindings (https://github.com/jsimonetti/berkeleydb) out of the game, so those have been unmaintained for nearly 5 years now.

@cjdelisle
Member

I will accept this if you change the interface so that it is a compile time error if arguments are passed wrong.

@johnsonjh
Author

I'll look into what's required to do so.

@johnsonjh
Author

OK, it seems Go does have (a few!) ways to do compile-time assertions, so I'll get this fixed up.
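
Two of the usual Go compile-time assertion idioms, for reference (the DB/bdb types here are illustrative stand-ins, not the actual pktwallet types):

```go
package main

import "fmt"

// DB and bdb are illustrative stand-ins for the walletdb types.
type DB interface{ Close() error }

type bdb struct{}

func (b *bdb) Close() error { return nil }

// Idiom 1: interface-satisfaction check. Compilation fails if
// *bdb ever stops implementing DB.
var _ DB = (*bdb)(nil)

// Idiom 2: constant condition check. The array length becomes
// negative (a compile error) unless the condition holds.
const expectedArgs = 3

var _ [expectedArgs - 3]struct{} // compiles only while expectedArgs == 3

func main() {
	fmt.Println("compile-time assertions held") // reaching main proves they compiled
}
```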

@johnsonjh
Author

johnsonjh commented Nov 17, 2020

@cjdelisle Other than the requested change (and an obligatory rebase), would you like me to add a clamp on the initial mmap size for 32-bit systems? While this will not actually "solve" the issue of the databases eventually becoming too big for 32-bit systems, it will greatly delay the inevitable OOM condition.
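
A clamp like the one proposed could look roughly like this; the 1GB cap is an illustrative value, not one taken from this PR:

```go
package main

import (
	"fmt"
	"strconv"
)

// maxMmap32 is a hypothetical cap for 32-bit platforms, where the
// whole process address space is at most 4GB (often much less usable).
const maxMmap32 = 1 << 30 // 1GB, illustrative only

// clampMmapSize caps the requested initial mmap size on 32-bit
// builds. strconv.IntSize is 32 there and 64 on 64-bit platforms.
func clampMmapSize(want int64) int64 {
	if strconv.IntSize == 32 && want > maxMmap32 {
		return maxMmap32
	}
	return want
}

func main() {
	// Requests under the cap pass through unchanged on any platform.
	fmt.Println(clampMmapSize(512 << 20)) // → 536870912
}
```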

@cjdelisle Edit: Besides a clamp on the initial mmap size, I can import my database compaction code and have it run before the databases open on 32-bit systems - this would hamper performance (both start-up time, due to the wait while the database is compacted, and run-time, as bbolt would then need to copy the database over and over to grow it, along with the subsequent remappings) but would likely allow a 32-bit system to run for quite some time before any OOM condition.

This can be done in a second, dependent PR, if you'd like, since it would add a lot of code. Having the compaction code included might be useful, because it could be triggered by a command-line argument and thus be available to everyone, not just 32-bit systems.

Also, if I were to take this approach, some of the performance issues could be mitigated by not compacting databases smaller than a certain size (say, 512MB or 1GB) and writing out a last-compaction time, so compaction would not occur on every start-up, but only after some duration, such as 24 hours, has elapsed.

Edit 2: I will also update the documentation to advise of the limitations of 32-bit systems, along with some suggestions, such as not running pktd and pktwallet on the same (32-bit) system. In the end, though, all of this is something of a temporary measure - 32-bit systems will eventually become unsupportable - but the above changes will buy significant time.

@cjdelisle
Member

  1. Fix the merge conflicts
  2. Change the function signature so that if you pass it garbage, it will compile time fail -- Just do the obvious thing and put the bbolt.Options in the function sig, it's not like we're going to change databases without knowing about it.
  3. Add documentation: 32 bit systems unsupported

@johnsonjh
Author

johnsonjh commented Nov 18, 2020

@cjdelisle Will do, even though I was learning how to do Go compile-time assertions! 🥇

  1. Change the function signature so that if you pass it garbage, it will compile time fail -- Just do the obvious thing and put the bbolt.Options in the function sig, it's not like we're going to change databases without knowing about it.

Ah, see, this all sort of stems from me assuming pktwallet would be the same as the pktd code, which does let the user change the database out from under us:

      --dbtype=               Database backend to use for the Block Chain (default: ffldb)

The wallet, thankfully, has no equivalent function.

We should probably remove that from pktd; it seems like a landmine waiting to be stepped on.

@cjdelisle
Member

Yes, drop it

* Bypass most abstractions
* Use standard function definitions
* Update GoDocs, package READMEs, etc.
@johnsonjh
Author

johnsonjh commented Nov 20, 2020

@cjdelisle

@johnsonjh
Author

johnsonjh commented Nov 21, 2020

Also to pre-emptively answer a few questions that might come up:

  1. Bypassing the two (actually, three) levels of abstraction in the database driver path has exposed some oddities of the interfaces, which are now used directly. For example, the ffldb interface is (Path, ChainParams.Net, Create) and the bdb interface is (Path, Create, Options). I was considering normalizing them, and also supporting options for ffldb, but didn't want to overcomplicate this PR.

2. `--memdb  Use the in-memory database. Don't do this unless you know what you're doing.` - This option was added by me for future use by the full control test harness; however, I never got around to implementing it, so I can remove it (it doesn't work) with a quick additional commit - it honestly just slipped by me. Edit: Corrected.

3. Looks like I missed properly updating the pktwallet/walletdb/bdb/README.md file to document the new interface - I will do so in a new commit. Edit: Corrected

4. In pktwallet/walletdb/bdb/db.go:390, a const table is added but currently unused - this is intentional, as I need to do more testing with these settings before "going live" with them; I hope it's OK to leave them there for the moment so I don't forget about them. The intention is to further harden the hard shutdown path by disabling the bbolt batching system - the way batches work in bbolt makes it hard to know when it's safe to shut down, because writes can persist even after the database has been closed and sync has been called. Edit: Corrected (by commenting).

5. A small number of unrelated changes are included, almost all syntax nitpicks - nothing significant, and nothing that affects either logic or any of the other pending PRs w.r.t. conflicts, etc.

6. 64-bit requirement still needs to be added to the primary README.md. Edit: Corrected.

7. Full regression not done, but all relevant tests have been updated and pass. Also, have not yet done extensive runtime checks with the race detector. Edit: Mostly complete - still have additional race detection enabled runtime testing pending. Edit: No new race conditions found.

* Remove option and documentation for the
  currently unimplemented memdb.
* Update documentation to reflect the actual API.
* Comment and document some TODO items.
* Update the primary README.md (in PR #220).
@johnsonjh
Author

@cjdelisle Updated with corrections for most items noted at #111 (comment)

@johnsonjh
Author

  • Verified clean merge against current develop as of Thu Nov 26 02:01:32 PM EST 2020
