-
Notifications
You must be signed in to change notification settings - Fork 35.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[pull request idea] addressindex, spentindex, timestampindex (Bitcore patches) #10370
Conversation
Adds new bitcoin.conf configuration options for three new indexes: -addressindex=1 -spentindex=1 -timestampindex=1 The addressindex records all changes to an address for retrieving txids, balances and unspent outputs for addresses. Changes are stored and sorted in block order. Both p2sh and p2pkh address types are supported. The index records two sets of key/value pairs. The first records all activity and is useful for viewing transaction history and all changes. The second is specifically for retrieving unspent outputs by address, and is smaller as values are removed once they are spent. The spentindex has multiple purposes and brings closer together inputs and outputs of transactions. The main purpose is to efficiently determine the address and amount of an input's previous output. The second purpose is to be able to determine which input spent an output. The timestampindex keeps track of timestamps with block hashes and is useful for searching blocks by date instead of by height. This is useful for a block explorer that will give search options by date. The index uses logical time correction to make sure that the results are sorted in block order. The logical time of a block is actual timestamp of the block, unless it is less than (earlier) the previous block's logical time, and in that case it is one second greater than the previous block's logical time. Includes logical time fix by Chethan Krishna Conflicts: src/main.cpp src/main.h src/txdb.cpp src/txdb.h src/txmempool.h
Adds new rpc commands that use the address, spent and timestamp indexes, including the new commands: - getblockdeltas - getblockhashes - getaddressmempool - getaddressutxos - getaddressdeltas - getaddressbalance - getaddresstxids - getspentinfo And modifications to the command: - getrawtransaction Conflicts: src/rpc/blockchain.cpp src/rpc/client.cpp src/rpc/misc.cpp src/rpc/rawtransaction.cpp src/rpc/server.cpp src/rpcserver.h
Tests the functionality of the indexes as well as the rpc commands
There was a previous assumption that blockindex would be quite small. With addressindex and spentindex enabled the blockindex is much larger and the amount of cache allocated for it should also increase. Furthermore, enabling compression should decrease the amount of disk space required and less data to write/read. The default leveldb max_open_files is set to 1000, for the blockindex the default is set to 1000 with compression. The 64 value that is current is kept for the utxo database and does not enable compression. Two additional options are added here to be able to configure the values for leveldb and the block index: - `-dbmaxopenfiles` A number of files for leveldb to keep open - `-dbcompression` Boolean 0 or 1 to enable snappy leveldb compression Conflicts: src/dbwrapper.cpp src/init.cpp
The tests seem to be failing on some rebased code. But as I said, I added this PR as a general discussion request. If there is an interest, I will clean this up. |
Having an independent address-index available somewhere may be handy and in general a great thing. Though, I'm not sure if we want to add this to the Bitcoin-Core repository. It already contains (too) much. Wallet, utilities, txindex, UI, p2p, consensus. I would prefer to keep the address-index (and all other sorts of indexes) in other repositories. If there are missing ZMQ features (I guess there is still an open PR for mempool ZMQ notifications), we should rather add those instead of the indexes. But I'm open for other opinions. |
I am not sure about the exact reasoning (cc @braydonf again), but older versions of Bitcore (< 3) had something similar - unpatched bitcoin core with additional indexes in node.js (see https://github.com/bitpay/bitcore-node/tree/v2.1.1/lib/services ) - but in Bitcore 3, they instead moved to including the address index in the bitcoin core itself. Perhaps it is easier to maintain one database over 2 separate databases? (I am guessing here) |
@Runn1ng have you looked at https://github.com/bitcoinjs/indexd? |
@dcousens thanks for noticing me! I will definitely look at that. |
There was a significant performance gain by building the indexes at the same time as the txindex, since each block only needed to be read and deserialized once. |
@braydonf at ~400ms per 1MB block... that really shouldn't be an issue - and the above is Javascript. IMHO I don't think |
In the past, we had implemented indexes in several ways, as mentioned, and found this to be the best option. I do think additional indexes, if needed, should be implemented within a full/spv node, the closeness reduces the overall complexity and duplication of disk and CPU usage. It otherwise ends up being what is effectively two nodes next to each other. The point that I realized this was when it was necessary to keep what is effectively another UTXO set for determining the address for inputs. As far as the utility of the indexes, and if they should be included in Bitcoin Core: I'd like to see the ability for an address index to be pruned (not included in this current implementation). An address index of the UTXO set and memory pool only, for example, may enable a pruned Bitcoin QT wallet to not need to rescan (and download) all blocks when adding new keys to a wallet. This could support many different hardware/offline wallets, with addresses and keys that the full node may not be aware. And in a way that doesn't depend on remote servers. This could also be useful for accepting bitcoin payments in an environment that the keys may change frequently e.g. on a web server, while also reducing the setup costs. As far as specific implementation details here, there may be some work here still: The indexes may ideally be implemented with a B/B+ tree in C (e.g. LMDB, WiredTiger, and etc.), the LSM tree of LevelDB is useful for smaller write heavy sets of data that are not queried often. The write amplification that happens in block compaction, especially when the database reaches 100+ GB can become problematic. This implementation of indexes is impacted by this issue, however it's mostly revolved by running on SSD and could be helped by running manual compaction (also not implemented here). That same issue may eventually affect the UTXO database as it grows in size. |
If you are using a pruned or SPV node, you really shouldn't be suffering from disk duplication problems.
Personally, I would only concept ACK this if it was completely contained to As far as I'm concerned, I don't see how there could be any significant IO inequal to that as if it was inlined within
I agree that the lack of convenient setup/configuration for these services is not great, but that doesn't mean we have to merge it all into a single binary.
I don't see how an inline index would alleviate this. |
Allowing Bitcore-Wallet-Service to run on top of a vanilla bitcoind + bitcore (no patched binaries needed during install) would be extremely useful. |
I just want to say that a UTXO database index in bitcoin core is a very good idea in my opinion. Keep in mind also there is BIP131, which requires a index on the UTXO set to be implemented. If BIP131 gets implemented, the UTXO database no longer becomes optional (as it will be required to validate new transactions), but since the UTXO index is relatively tiny compared to the In the future, I see the wallet features and UI elements removed from bitcoin core, and backend stuff like indexes taking it's place.My opinion is we should go hog wild and add as many (optional) indexes as conceivable. |
As said in #9806 I'm not opposed to an additional optional index similar to -txindex. For example "-scriptpubkeyindex". The minimum necessary for an explorer I think would be -scriptpubkeyindex plus a call searchscriptpukey that takes the hex of a scriptpubkey and returns a list of outpoints (perhaps including the block hash, see #10275 ). But for the block explorer use case I assume no pruning and serving spent transactions too, which will result in a big index and code that I'm sure some people won't be very happy to help maintain. Even if that's the case, I'm interested in this use case and I may be interested in rebasing the same resulting branch from major version to major version. Starting with a branch that only indexes the unspent scriptpupkeys as @braydonf suggests may be much more acceptable (and whether or not the explorer part is acceptable, the diff to maintain will be much smaller if that gets accepted). Not only because the index would be much smaller and because it doesn't conflict with pruning in any way, but also because the use case of removing the need for rescan is much more compelling.
Sure, the explorer and norescan are different use cases.
Pruned or not, you don't need to trust your data source if you are already a full node. I think the SPV use case is just adding confusion, the utxo-only index can be very useful without taking SPV nodes into account at all.
Absolutely, adding the index doesn't mean we should also add every potential call that would make use of it. Even if that would be more convenient, marginal use cases can perhaps go through less direct routes like implementing some things on their side or calling the rpc twice instead of once to get some data they want, which is may not be a great penalty in performance depending on the case. |
I am closing this PR. We are running bitcoin with these custom patches on our servers and while it is working so far, it is not very maintainable long-term and the leveldb performance is subpar for the addressindex. External index is the only solution. |
We (SatoshiLabs) use bitcore for our backend; and bitcore depends on a fork of bitcoin-core, that adds additional indexes and some other functionality.
Since bitpay stopped updating the fork to newest bitcoin, effectively making bitcore unable to see segwit, we keep rebasing the code to the latest releases. So I thought it wouldn't be a bad idea to try to actually put it into upstream :)
What the code does is adding additional indexes into bitcoin-core that enable to run full block explorer, like insight, and also adds some RPC calls for those indexes. I think it would be beneficial to the ecosystem at large to have these RPC calls/indexes available in upstream.
Most of the work has been done by @braydonf ; I just rebased the code (first to 0.14.* here - https://github.com/satoshilabs/bitcoin - and on master in this PR) and tested it with segwit. (The original, outdated bitcore code is at https://github.com/bitpay/bitcoin )
This is not meant as a definitive PR - it is too big and needs to be cleaned up and broken into smaller parts (it's obvious it has been rebased several times) - but more as a request for discussion; I want to know if I should start spending time with cleaning the commits up etc. or if you don't want this functionality in the bitcoin-core at all.
The added RPCs are:
These calls are sufficient to build block explorer.