New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
indexers: asynchronous index updates. #2219
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you expanding a bit on how this fits into the bigger picture? It really doesn't make sense to me, because an index will need to track the current tip, but that needs to be done by hash, not height. A height is not unique and thus does not provide enough information to tell you how to recover at startup.
I didn't factor in recover on startup via the tip height, that's why I'm tracking the tip height here and not the hash, will correct. My plan is to update the other optional indexes to track the tip and update the index manager to asynchronously receive updates for processing by enabled indexes. |
I'll wait a bit until you get further before making a final judgement, but, my gut feeling is that the approach of trying to make the manager asynchronous with a message channel for connect/disconnect is not going to work out super well. Without going all of the various reasons why, I think it's sufficient to consider that the primary issue is likely to be that it is still relying on being able to send connect/disconnect messages from within the database transaction and, even though it does the indexing of that particular block asynchronous, it will just end up blocking on the channel send when the next block is processed if the previous index operation is ongoing. More generally, the existing infrastructure is designed specifically for synchronous indexing and under the assumption that it all happens under an atomic database transaction. The entire goal #1470 is to make that no longer required, which means just making the existing infrastructure run in a goroutine isn't likely to work well, since it will almost positively lead to needing a bunch of less than ideal hacks and can kicking to make it work instead of actually having a solid asynchronous design. Further, there are indexes which are required to be in sync for proper consensus operation, such as cfilterv2, so you can't really just make them all async. |
Noted, thanks for taking a look. Will rethink my approach and tinker some more. |
617e5b1
to
107811f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started to review this and noticed something that I figured I'd push a review for right away so you can fix it.
c085560
to
e092918
Compare
3fae635
to
b0733cc
Compare
Coverages for updated rpc tests. ❯ go tool cover -func=cov.out | grep -E "handleExistsAddress"
github.com/decred/dcrd/internal/rpcserver/rpcserver.go:1479: handleExistsAddress 100.0%
❯ go tool cover -func=cov.out | grep -E "handleExistsAddresses"
github.com/decred/dcrd/internal/rpcserver/rpcserver.go:1531: handleExistsAddresses 100.0%
❯ go tool cover -func=cov.out | grep -E "handleGetCFilter"
github.com/decred/dcrd/internal/rpcserver/rpcserver.go:2318: handleGetCFilter 100.0%
❯ go tool cover -func=cov.out | grep -E "handleGetCFilterHeader"github.com/decred/dcrd/internal/rpcserver/rpcserver.go:2387: handleGetCFilterHeader 100.0%
❯ go tool cover -func=cov.out | grep -E "handleSearchRawTransactions"
github.com/decred/dcrd/internal/rpcserver/rpcserver.go:4382: handleSearchRawTransactions 99.3% |
|
e7dde47
to
d4d4d08
Compare
d4d4d08
to
a1184e5
Compare
a1184e5
to
df90022
Compare
df90022
to
c191657
Compare
674f5fb
to
71fd6ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed all updates and tested several cases such as dropping the transaction index and ensuring it also drops the dependent address index, performing a full reindex, and ensuring the blocking behavior works as intended.
The final review commit needs to be squashed into the appropriate commits to which the changes apply and then I'll get this merged.
Well, actually I see @rstaudt2 still has some changes requested that probably have been addressed, but I'd like to give him an opportunity to confirm before merging. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is missing calling RemoveSpendConsumerDependency
from the indexers (so spend journal entries are never removed for disconnected blocks when indexes are enabled).
Outside of that, this looks good overall to me. I also confirmed that the panic that can be reproduced on master
with these steps is now fixed.
Probably also due to Edit: Actually, it’s probably unrelated if you were just doing a full sync without disconnecting blocks. |
71fd6ab
to
d4f9a72
Compare
The relevant source from the profile:
In other words it's holding onto ~170M worth of block data. I haven't dug into it at all, but my first guess would be that the indexing is creating and holding on to references to the scripts within the blocks and because they're interior pointers, that naturally means the blocks can't be garbage collected. |
|
9f60f50
to
fc93788
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I retested with the latest and did not see any issues or significant difference in memory usage. Looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've reviewed the latest changes as well as tested everything and can confirm the memory issue is resolved. With all indexes enabled as well as both cpu and mem profiling, the peak usage reported by the OS on a full initial chain sync is 1.19GiB.
So, code and functionality wise, everything looks good to go.
The last outstanding thing is the commit messages. I see some of them aren't properly wrapped per the code contribution guidelines, and others look to be wrapped too short. Please amend them to make them properly conform and I'll get this merged.
Nice work!
Thanks, will get the outstanding issues sorted this evening. |
fc93788
to
21301c6
Compare
This removes index manager and moves helper functions defined with the manager to common.go.
This adds the IndexSubscriber which allows subscribing for index notifications as well as specifying an index notification dependency or prerequisite for a subscriber.
This removes the IndexManager interface and reworks the ChainQueryer and Indexer interfaces to facilitate index notification subscriptions and asynchronous index notifications.
This reworks the TransactionIndex to implement the updated Indexer interface for asynchoronous index updates. This index subscribes with no prequisites.
This updates the AddressIndex to implement the updated Indexer interface for index updates. The index however subscribes with the transaction index as a prerequisite. This also adds the spend consumer for use by indexers that need to lookup spend journal entries.
This reworks the ExistsAddressIndex to implement the updated Indexer interface for asynchoronous index updates. This index subscribes with no prerequisites.
This removes last set of references to IndexManager and integrates the IndexSubscriber into the chain. All indexes have now been updated to subscribe to the index subscriber for updates.
This updates rpc handlers to wait for their associated indexes to sync if needed before responding to a request.
This updates the blockImporter to use an IndexSubscriber in order to leverage async indexes.
This updates the index subscriber catchup function to update subscriptions directly with notification instead of going through the Run lifecycle process which avoids a case of receiving out of sync index notifications while catch up is in progress. Multiple typos in error type tests and review issues have also been fixed in this commit.
This integrates the spend pruner's RemoveSpendConsumerDependency method in the chaing query adapter to allow consumers access to the functionality. The address index has been updated as such to remove its spend journal dependencies when it processes block disconnections.
This refactors finding the lowest index tip height out of CatchUp into findLowestIndexTipHeight. The prevscripter snapshot now makes copies of the pkScript to avoid holding on to references too long during the catch up process.
21301c6
to
9a4400e
Compare
This reworks index updates by introducing an index subscriber which allows optional indexers to be asynchronously updated. The Index manager which synchronously updated optional indexes has been removed.
The changes made are as follows:
indexers: remove index manager.
the manager to
common.go
.indexers: add index subscriber.
This adds the
IndexSubscriber
which allows subscribing for indexnotifications as well specifying an index notification dependency or
prerequisite for a subscriber. The
IndexSubscriber
relays receivedIndexNtfn
s to itsIndexSubscription
s for processing.An Index subscription can either be prerequisite or a dependency to another
index subscription, however only independent subscriptions directly receive
index updates from the subscriber. Dependent subscriptions process index
notifications synchronously with their prerequisite subscription, they
receive index subscriptions after their prerequisites are done processing.
indexers: refactor interfaces.
IndexManager
interface and reworks theChainQueryer
andIndexer
interfaces to facilitate indexnotification subscriptions and asynchronous index notifications.
blockchain/indexers: async transaction index.
This reworks the
TransactionIndex
to implement the updatedIndexer
interface for asynchronous index updates. This index subscribes
with no prerequisites.
Associated tests for the transaction index have also been added.
blockchain/indexers: update address index.
This updates the
AddressIndex
to implement the updatedIndexer
interface for index updates. The index however subscribes with the
TransactionIndex
as a prerequisite.that need to lookup spend journal entries.
Associated tests for the address index have also been added.
blockchain/indexers: async transaction index.
This reworks the
ExistsAddressIndex
to implement the updatedIndexer
interface for asynchoronous index updates. This index subscribes
with no prerequisites.
Associated tests for the exists addresss index have also been added.
multi: integrate index subscriber.
IndexManager
and integratesthe
IndexSubscriber
into the chain. All indexes have now been updated tosubscribe to the index subscriber for updates.
internal/rpcserver: wait for sync on rpc request.
This updates rpc handlers to wait for their associated indexes to
sync if needed before responding to a request.
Associated handler tests have been updated to cover the added scenarios.
cmd/addblock: update block importer.
blockImporter
to use anIndexSubscriber
in order toleverage async indexes.
Resolves #1470.