Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor indexers #424

Closed
wants to merge 7 commits into from
Closed

Conversation

tuxcanfly
Copy link
Member

@tuxcanfly tuxcanfly commented Mar 1, 2018

This PR refactors the indexers into a lib with it's own db.

  • Handle sync, tip catch up
  • Add migration to remove existing indices
  • Cleanup old indexing code

Refs #422

@tuxcanfly
Copy link
Member Author

Moved the indexer itself to:

https://github.com/tuxcanfly/bindex

This PR will be limited to the plugin integration, migration etc.

@bucko13 bucko13 added enhancement Improving a current feature and removed WIP labels Mar 13, 2018
Copy link
Contributor

@bucko13 bucko13 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Nice to have the plugin separated out. I guess this is probably similar to the approach we'll want to do at some point with bwallet.

const chain = await this.chain.getMetaByAddress(addrs);
return chain.concat(mempool);
return mempool.concat(chain);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why reverse this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case would occur if indexes are disabled, so I think it should just return mempool actually. Will check that.

lib/node/node.js Outdated
@@ -81,6 +81,11 @@ class Node extends EventEmitter {
if (config.has('logger'))
logger = config.obj('logger');

const plugins = this.config.array('plugins', []);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean we add bindex by default if indexing is turned on? Does it make a difference that it's in a separate module now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By implicitly enabling bindex plugin we can ease the migration. No config files need be changed.

@@ -968,7 +968,7 @@ class RPC extends RPCBase {
if (hash) {
block = await this.chain.getBlock(hash);
} else if (this.chain.options.indexTX) {
const tx = await this.chain.getMeta(last);
const tx = await this.node.getMeta(last);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is the difference with having indexing as a plugin on the node rather than built into the chain huh? 😄

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup :) Since indexing happens outside the chain.

@@ -0,0 +1,82 @@
'use strict';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have an explicit note somewhere that this basically just removes your old db and does a fresh sync with the new indexing?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah we can add that to the CHANGELOG

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to CHANGELOG

@tuxcanfly
Copy link
Member Author

Thinking this through, I think the indexing interface and nested plugins add unnecessary complexity. Also, splitting out tx, addr as plugins etc would make it hard to query them back, since we would need to know which keys the plugins are writing to.

To keep things simple, I propose that the core indexers be a part of the indexing plugin directly. Alternate implementations can fork the index plugin for specific use cases (utxo set, colored coins) etc. Thanks @bucko13 for the discussion.

@tuxcanfly tuxcanfly changed the title refactor indexers into plugin based system refactor indexers into a plugin Mar 20, 2018
@@ -147,6 +144,14 @@ class FullNode extends Node {
noAuth: this.config.bool('no-auth')
});

// Index needs access to the node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed for some of the events?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, we need it to listen to the chain events. This specific comment can be improved though, will update it.

@tuxcanfly
Copy link
Member Author

There's an issue with the way re-orgs are handled. Working to fix that and add more tests.

@chjj chjj changed the base branch from wallet-rewrite to master March 30, 2018 05:23
@chjj chjj changed the base branch from master to wallet-rewrite March 30, 2018 05:25
@chjj
Copy link
Member

chjj commented Mar 30, 2018

Needs rebase on top of master.

@tuxcanfly tuxcanfly changed the base branch from wallet-rewrite to master March 30, 2018 20:25
@tuxcanfly
Copy link
Member Author

@chjj I realized I need to store all the blockhashes similar to wallet, because a re-org will leave my client out of sync and I wouldn't know where to rescan from. I might need to borrow some code from the wallet, specifically WalletDB. Is that fine or do you think this part should be moved to a common class? FWIW @nodar-chkuaselidze will need it too for bmultisig.

@nodech
Copy link
Member

nodech commented Mar 31, 2018

My case is simpler, because bmultisig is plugin for wallet I can watch wallet events and check coins in db on load. Wallet will be handling that for me.

@tuxcanfly
Copy link
Member Author

Making bindex a part of lib to prevent dependency issues. fullnode's dependency on bindex for RPC calls etc makes it difficult to manage as an external module. Most of the chain syncing ideas were inspired from wallet so keeping it in lib is similar in that regard.

@tuxcanfly tuxcanfly force-pushed the index-plugin branch 5 times, most recently from 0867be8 to 08b38b3 Compare April 9, 2018 22:19
@tuxcanfly
Copy link
Member Author

Needs some updates to handle different databases for different indexers.

*/

async getHashesByAddress(addrs) {
const set = new BufferSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You probably could use an array here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a side note, if there are a lot of txids for an address this will likely have performance issues. The largest number of transactions for an address right now is over 3 million.

@braydonf
Copy link
Contributor

braydonf commented Aug 23, 2018

Worth considering as this code, and database, is being changed. There is an issue with the existing txindex. The tx index is keeping an entire copy of the transaction https://github.com/bcoin-org/bcoin/blob/master/lib/primitives/txmeta.js#L239 so there is a duplicate of the transaction on disk.

I've documented this more at #585

Edit: The existing issues with txindex and addrindex can be addressed in another PR, this PR I think is mainly around having those indexes be separated into another database (one for each index).

module indexer introduces a extensible architecture for indexing the
chain. It provides a base class which handles syncing with the chain,
handling re-orgs, interruptions, dynamic toggling, etc. TXIndexer
and AddrIndexer are provided for indexing transactions and addresses,
using the same flags as before i.e --index-tx and --index-address.
Indexes are stored in a different database and can be maintained
independently of the chain.
also rebase on top of master, fix conflicts
@tuxcanfly
Copy link
Member Author

Similar to #532 need to decide how we want to handle re-index in case of spv, pruned nodes. We might want to ask the user to reset the chain.

@tuxcanfly
Copy link
Member Author

Added an error when re-indexing when pruned, the only case where it's not possible to re-index.

@braydonf
Copy link
Contributor

braydonf commented Dec 10, 2018

Indexing, in this case, I think really only applies to full nodes, not pruned or spv. Aside from anything that can be indexed headers, or select transactions such as the wallet. It's currently technically possible for the txindex to be enabled on a pruned node, as the transaction data is duplicated, but this should likely change to reduce total disk space usage (see #585).

@tuxcanfly
Copy link
Member Author

Agreed, but we already support indexing when pruned, so disabling this would probably need a deprecation notice. Re-indexing was never supported, but with this PR, it is supported for full nodes, but not for pruned nodes. The error applies to the latter case.

@braydonf
Copy link
Contributor

braydonf commented Mar 27, 2019

Some continued work on this is at an indexer branch that is being merged with changes from file block store in #703 and some of the additional addrindex fixes.

@tuxcanfly
Copy link
Member Author

Closing this in favor of #758 reworked to support the new flat file backend.

@tuxcanfly tuxcanfly closed this Apr 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Improving a current feature indexer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants