database: Major redesign of database package. #380

davecgh · 2015-04-16T21:46:33Z

This pull request contains a complete redesign and rewrite of the database package that approaches things in a vastly different manner than the previous version. This is the first part of several stages that will be needed to ultimately make use of this new package.

Some of the reasons for this were discussed in #255; a quick summary follows:

- The previous database could only contain blocks on the main chain, and reorgs required deleting blocks from the database. This made it impossible to store orphans and could cause external RPC calls for block information to fail in the middle of a reorg.
- The previous database interface forced a high level of bitcoin-specific intelligence, such as spend tracking, into each backend driver.
- That in turn made it difficult to implement new backend drivers, because each one had to repeat a lot of non-trivial logic that is better handled at a higher layer, such as the blockchain package.
- The old database stored all blocks in leveldb. This made it extremely inefficient to look up headers and individual transactions, since the entire block had to be loaded from leveldb (which entails data copies) to get access to them.
- The vast majority of database activity after the initial block download is read activity; however, leveldb, as its name implies, is optimized for leveled write performance at the expense of read performance.

In order to address all of these concerns, and others not mentioned, the database interface has been redesigned as follows:

- Two main categories of functionality are provided: block storage and metadata storage
- All block storage and metadata storage are done via read-only and read-write MVCC transactions with both manual and managed modes
  - Support for multiple concurrent readers and a single writer
  - Readers use a snapshot and therefore are not blocked by the writer
- Some key properties of the block storage and retrieval API:
  - It is generic and does NOT contain additional bitcoin logic such as spend tracking and block linking
  - Provides access to the raw serialized bytes so deserialization is not forced for callers that don't need it
  - Support for fetching headers via independent functions, which allows implementations to provide significant optimizations
  - Ability to efficiently retrieve arbitrary regions of blocks (transactions, scripts, etc.)
- A rich metadata storage API is provided:
  - Key/value with arbitrary data
  - Support for buckets and nested buckets
  - Bucket iteration through a couple of different mechanisms
  - Cursors for efficient and direct key seeking
- Supports registration of backend database implementations
- Comprehensive test coverage
- Provides strong documentation with example usage
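
To illustrate the transaction model in the list above (managed transactions, a single writer, readers working from a snapshot), here is a toy in-memory sketch. It is not the database package's API: the `store` and `tx` types, the `View` and `Update` method names, and the copy-on-write map are all assumptions made for this example, and only the managed mode is shown.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// store is a hypothetical key/value store used only for illustration.
type store struct {
	writeMu sync.Mutex        // serializes writers: one writer at a time
	mu      sync.RWMutex      // guards the current pointer only
	current map[string]string // treated as immutable; replaced on commit
}

// tx is a transaction bound to the snapshot that existed when it began.
type tx struct {
	snap    map[string]string // snapshot visible to this transaction
	pending map[string]string // uncommitted writes; nil when read-only
}

func newStore() *store { return &store{current: map[string]string{}} }

func (s *store) snapshot() map[string]string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.current
}

// Get prefers the transaction's own writes, then falls back to the snapshot.
func (t *tx) Get(key string) (string, bool) {
	if v, ok := t.pending[key]; ok {
		return v, true
	}
	v, ok := t.snap[key]
	return v, ok
}

// Put records a write; it fails inside a read-only transaction.
func (t *tx) Put(key, value string) error {
	if t.pending == nil {
		return errors.New("transaction is read-only")
	}
	t.pending[key] = value
	return nil
}

// View runs fn in a managed read-only transaction. No lock is held while
// fn runs, so a concurrent writer is never blocked by the reader.
func (s *store) View(fn func(*tx) error) error {
	return fn(&tx{snap: s.snapshot()})
}

// Update runs fn in a managed read-write transaction. It commits when fn
// returns nil and discards the pending writes otherwise.
func (s *store) Update(fn func(*tx) error) error {
	s.writeMu.Lock()
	defer s.writeMu.Unlock()
	t := &tx{snap: s.snapshot(), pending: map[string]string{}}
	if err := fn(t); err != nil {
		return err // rollback: pending writes are dropped
	}
	next := make(map[string]string, len(t.snap)+len(t.pending))
	for k, v := range t.snap {
		next[k] = v
	}
	for k, v := range t.pending {
		next[k] = v
	}
	s.mu.Lock()
	s.current = next // readers that already hold the old map are unaffected
	s.mu.Unlock()
	return nil
}

func main() {
	s := newStore()
	_ = s.Update(func(w *tx) error { return w.Put("tip", "block-1") })
	_ = s.View(func(r *tx) error {
		// A write commits while the reader is still open.
		_ = s.Update(func(w *tx) error { return w.Put("tip", "block-2") })
		v, _ := r.Get("tip")
		fmt.Println("reader still sees:", v)
		return nil
	})
}
```

Copying the whole map on every commit is only acceptable in a toy; the point is the visibility rule, where a reader keeps the state it started with regardless of later commits.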

This pull request also contains an implementation of the interface described above, named ffldb (flat file plus leveldb metadata backend). Here is a quick overview:

- Highly optimized for read performance with consistent write performance regardless of database size
- All blocks are stored in flat files on the file system
- Bulk block region fetching is optimized to perform linear reads, which improves performance on spindle disks
- Anti-corruption mechanisms:
  - Flat files contain full block checksums to quickly and easily detect database corruption without needing to do expensive merkle root calculations
  - Metadata checksums
  - Open reconciliation
- Extensive test coverage:
  - Comprehensive blackbox interface testing
  - Whitebox testing which uses intimate knowledge to exercise uncommon failure paths such as deleting files out from under the database
  - Corruption tests (replacing random data in the files)
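
The flat-file ideas in the overview above (a per-block checksum and direct region reads) can be sketched as follows. The record layout used here, a 4-byte length, the payload, and a CRC-32 (Castagnoli) checksum, is an assumption for illustration and may not match ffldb's actual on-disk format; `flatFile`, `location`, `appendBlock`, `readBlock`, and `readRegion` are hypothetical names, and a byte slice stands in for the file.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
)

// castagnoli is the CRC-32 table used for record checksums in this sketch.
var castagnoli = crc32.MakeTable(crc32.Castagnoli)

// location is what a metadata index would store per block (hypothetical).
type location struct {
	offset uint32 // start of the record within the flat file
	length uint32 // payload length, excluding the 8 bytes of framing
}

// flatFile stands in for an append-only file on disk.
type flatFile struct{ data []byte }

// appendBlock writes <length><payload><checksum> and returns its location.
func (f *flatFile) appendBlock(block []byte) location {
	loc := location{offset: uint32(len(f.data)), length: uint32(len(block))}
	rec := make([]byte, 4+len(block)+4)
	binary.LittleEndian.PutUint32(rec[:4], loc.length)
	copy(rec[4:], block)
	sum := crc32.Checksum(rec[:4+len(block)], castagnoli)
	binary.LittleEndian.PutUint32(rec[4+len(block):], sum)
	f.data = append(f.data, rec...)
	return loc
}

// readBlock returns the payload after verifying the stored checksum.
func (f *flatFile) readBlock(loc location) ([]byte, error) {
	end := uint64(loc.offset) + 4 + uint64(loc.length) + 4
	if end > uint64(len(f.data)) {
		return nil, errors.New("record extends past end of file")
	}
	rec := f.data[loc.offset:end]
	body, tail := rec[:len(rec)-4], rec[len(rec)-4:]
	if crc32.Checksum(body, castagnoli) != binary.LittleEndian.Uint32(tail) {
		return nil, errors.New("checksum mismatch: record is corrupt")
	}
	return body[4:], nil
}

// readRegion returns n bytes starting at off within the block payload
// without loading or checksumming the rest of the block.
func (f *flatFile) readRegion(loc location, off, n uint32) ([]byte, error) {
	if uint64(off)+uint64(n) > uint64(loc.length) {
		return nil, errors.New("region is outside the block")
	}
	start := uint64(loc.offset) + 4 + uint64(off)
	if start+uint64(n) > uint64(len(f.data)) {
		return nil, errors.New("region extends past end of file")
	}
	return f.data[start : start+uint64(n)], nil
}

func main() {
	f := &flatFile{}
	a := f.appendBlock([]byte("header-a|tx1|tx2"))
	region, _ := f.readRegion(a, 9, 3)
	fmt.Printf("region: %s\n", region)

	f.data[a.offset+6] ^= 0xff // simulate on-disk corruption
	_, err := f.readBlock(a)
	fmt.Println("read after corruption:", err)
}
```

A checksum over the raw record catches a flipped byte with a single pass over the data, which is the kind of cheap detection the overview contrasts with recomputing a merkle root. Skipping verification on region reads is a choice made in this sketch to keep them cheap.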

In addition, there is a new tool provided under the new database directory named dbtool which provides a few basic commands for testing the database. It is designed around commands, so it could be useful to expand on in the future.

Finally, this commit addresses the following issues:

- Adds support for and therefore closes "[database] Support multiple (non-best) chains, concurrent reads on different forks." (#255)
- Fixes "power outage and corrupted db" (#199)
- Fixes "Database seems to extremely rarely miss transactions that were added" (#201)
- Implements and closes "[database] BlockShaMissing should include the missing block hash" (#256)
- Obsoletes and closes "[database] Remove Db.DropAfterBlockBySha and Db.RollbackClose" (#257)
- Closes "Very slow block processing" (#247) once the required chain and btcd modifications are in place to make use of this new code

There are several things that need to happen before this PR can be merged and ultimately used:

- The blockchain package needs to be redesigned to pick up the functionality the new database is losing (spend tracking, chain linking, transaction index) (blockchain: Rework to use new db interface. #491)
- The blockchain package needs to be made safe for concurrency (blockchain: Rework to use new db interface. #491)
- btcd needs to be updated to use the redesigned blockchain and database packages (blockchain: Rework to use new db interface. #491)
- A database cache is needed to increase throughput

davecgh · 2015-04-17T06:47:22Z

Rebased and updated for btcutil.Block.Sha() API change.

jrick · 2015-04-28T14:57:08Z

database2/ffboltdb/blockio.go

+	// and lookup map.
+	//
+	// openBlocksLRU tracks how the open files are refenced by pushing the
+	// least recently used files to end of the list.  When a file needs to


s/end/beginning/ ?

davecgh · 2015-04-28

It's right. The most recently used file goes to the beginning and the least recently used to the end.

davecgh · 2015-04-28

However, I'll reword it in terms of the most recently used, since I can see how the current wording could be confusing.
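
The ordering discussed in this thread (most recently used at the front, least recently used at the end) can be sketched with the standard `container/list` package. This is an illustrative sketch, not the code under review: the `openFiles` type, its fields, and the `touch` method are hypothetical, and real file handles are replaced by file numbers.

```go
package main

import (
	"container/list"
	"fmt"
)

// openFiles tracks a bounded set of open file numbers (hypothetical type).
type openFiles struct {
	max   int                      // maximum number of files kept open; assumed >= 1
	lru   *list.List               // front = most recently used, back = least
	elems map[uint32]*list.Element // file number -> its element in lru
}

func newOpenFiles(max int) *openFiles {
	return &openFiles{max: max, lru: list.New(), elems: map[uint32]*list.Element{}}
}

// touch marks fileNum as just used, "opening" it if necessary. When the
// limit is reached, the file at the back of the list is evicted and its
// number is returned so the caller could close it.
func (o *openFiles) touch(fileNum uint32) (evicted uint32, didEvict bool) {
	if e, ok := o.elems[fileNum]; ok {
		o.lru.MoveToFront(e) // already open: just refresh its position
		return 0, false
	}
	if o.lru.Len() >= o.max {
		evicted = o.lru.Remove(o.lru.Back()).(uint32)
		delete(o.elems, evicted)
		didEvict = true
	}
	o.elems[fileNum] = o.lru.PushFront(fileNum)
	return evicted, didEvict
}

func main() {
	o := newOpenFiles(2)
	o.touch(1)
	o.touch(2)
	o.touch(1) // file 1 becomes the most recently used
	if n, ok := o.touch(3); ok {
		fmt.Println("evicted file", n)
	}
}
```

With a limit of two files, touching 1, 2, then 1 again leaves file 2 at the end of the list, so opening file 3 evicts file 2.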

davecgh · 2015-11-14T19:59:34Z

Rebased to latest master.

davecgh · 2015-11-28T18:30:49Z

Rebased to latest master.

davecgh · 2015-12-08T09:35:58Z

Rebased to latest master.

davecgh · 2016-02-01T20:40:10Z

OK, so I'm making a push to get this merged within the next couple of days. It has been running for months with no issues.

Calling for final reviews.

davecgh · 2016-02-01T21:39:09Z

Reviewed 29 of 42 files at r1, 4 of 7 files at r2, 1 of 2 files at r4, 5 of 8 files at r5, 5 of 5 files at r6.
Review status: all files reviewed at latest revision, all discussions resolved.

Comments from the review on Reviewable.io

Dirbaio · 2016-02-01T21:39:09Z

LGTM 👍

I've been running with database2 in production for a few months and it works great.

dajohi · 2016-02-03T17:34:55Z

OK from me. Been running for months without issue.

jrick · 2016-02-03T17:35:17Z

ok here too.

davecgh · 2016-02-03T17:46:15Z

Reviewed 39 of 39 files at r7.
Review status: all files reviewed at latest revision, all discussions resolved.

Comments from the review on Reviewable.io

davecgh force-pushed the database2 branch 4 times, most recently from 7c1708e to 0833b32 on April 17, 2015 06:47

davecgh force-pushed the database2 branch from 0833b32 to d0300e7 on April 17, 2015 06:52

jrick reviewed Apr 28, 2015

dajohi mentioned this pull request Oct 28, 2015

Add gettxoutsetinfo RPC #142

Open

davecgh force-pushed the database2 branch from 6767b42 to 5472031 on November 14, 2015 19:54

davecgh force-pushed the database2 branch from 5472031 to a82cb0a on November 28, 2015 18:30

davecgh force-pushed the database2 branch from a82cb0a to 7eadb9c on December 8, 2015 09:09

davecgh force-pushed the database2 branch from 7eadb9c to e8b23ed on December 10, 2015 02:48

davecgh mentioned this pull request Dec 14, 2015

[addrmgr] Update to make use of upcoming database #581

Open

davecgh force-pushed the database2 branch from 4821376 to 1aaa3b4 on December 29, 2015 09:29

davecgh force-pushed the database2 branch from 1aaa3b4 to d3b92f2 on January 6, 2016 17:03

davecgh force-pushed the database2 branch 2 times, most recently from 0e37dac to 525e996 on February 1, 2016 19:11

davecgh force-pushed the database2 branch 3 times, most recently from 34fee5d to 8b2bd5d on February 1, 2016 22:11

davecgh added 2 commits February 3, 2016 11:42

database: Implement cache layer.

0b32feb

This commit adds a database cache layer to the ffldb database backend so that callers can commit multiple transactions without having to incur the overhead of a disk sync on every new block.
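
As a rough illustration of why such a cache layer reduces sync overhead, the sketch below buffers committed writes in memory and pushes them to the backing store in one batch. It is not the ffldb cache implementation: the `backend` and `cache` types, the size-based flush threshold, and the `syncs` counter (standing in for a disk sync) are assumptions made for the example.

```go
package main

import "fmt"

// backend stands in for the on-disk store; syncs counts simulated disk syncs.
type backend struct {
	data  map[string]string
	syncs int
}

// writeBatch persists a batch of writes with a single simulated sync.
func (b *backend) writeBatch(batch map[string]string) {
	for k, v := range batch {
		b.data[k] = v
	}
	b.syncs++
}

// cache buffers committed writes until they exceed maxSize bytes.
type cache struct {
	disk    *backend
	pending map[string]string
	size    int // approximate bytes currently buffered
	maxSize int // flush threshold in bytes (hypothetical policy)
}

func newCache(maxSize int) *cache {
	return &cache{
		disk:    &backend{data: map[string]string{}},
		pending: map[string]string{},
		maxSize: maxSize,
	}
}

// commit merges one transaction's writes into the cache and flushes only
// when the buffered data reaches the threshold.
func (c *cache) commit(writes map[string]string) {
	for k, v := range writes {
		if old, ok := c.pending[k]; ok {
			c.size -= len(k) + len(old)
		}
		c.pending[k] = v
		c.size += len(k) + len(v)
	}
	if c.size >= c.maxSize {
		c.flush()
	}
}

// flush writes everything buffered so far to the backend in one batch.
func (c *cache) flush() {
	if len(c.pending) == 0 {
		return
	}
	c.disk.writeBatch(c.pending)
	c.pending = map[string]string{}
	c.size = 0
}

// get checks buffered writes first so committed data is always visible.
func (c *cache) get(key string) (string, bool) {
	if v, ok := c.pending[key]; ok {
		return v, true
	}
	v, ok := c.disk.data[key]
	return v, ok
}

func main() {
	c := newCache(1 << 20)
	for i := 1; i <= 3; i++ {
		c.commit(map[string]string{fmt.Sprintf("hash-%d", i): "block bytes"})
	}
	fmt.Println("syncs before flush:", c.disk.syncs)
	c.flush()
	fmt.Println("syncs after flush:", c.disk.syncs)
}
```

The trade-off in a design like this is that writes still sitting in the buffer are lost if the process dies before the next flush, so a real implementation has to decide how often to flush and how to recover.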

davecgh force-pushed the database2 branch from 8b2bd5d to 0b32feb on February 3, 2016 17:42

conformal-deploy merged commit 0b32feb into btcsuite:master Feb 3, 2016

davecgh deleted the database2 branch February 3, 2016 19:43

davecgh mentioned this pull request Feb 14, 2016

Implement banning based on dynamic ban scores #611

Merged

davecgh mentioned this pull request Mar 16, 2016

database: Major redesign of database package decred/dcrd#91

Merged

3 tasks

MrAlexWeber mentioned this pull request Jan 26, 2018

Pruning support? #1069

Closed
