sync clarity and eliminate batch vouts.spend_tx_row_id update #1840

Merged
merged 6 commits into from
Jul 21, 2021


chappjc
Member

@chappjc chappjc commented Jul 19, 2021

This PR has been running on tip.dcrdata.org, the hidden service, testnet, and one mainnet backend.

This contains a few distinct improvements. All relate to the DB or sync, so they are stacked commits:

  • Each step of the startup sync is described in terms of stages. e.g. Beginning SYNC STAGE 1 of 6 (block data import)...
  • The batch update of the vouts.spend_tx_row_id column is eliminated. The column is now always updated on the fly, just as during normal operation. The mega query that previously did this update during the initial sync (see updateSpendTxInfoInAllVouts) was surprisingly costly on slower drives: on an SSD it took less than 30 minutes, but on an HDD it ran for 12+ hours (not acceptable). Doing the update on the fly even during the initial block data import is acceptable because each update is conditioned on the primary key of the vouts table and runs before the index on spend_tx_row_id is created. On my machine this increased the stage 1 time from 155 to about 165 minutes. The increase is likely to be larger on spinning disks, but it is more tolerable than an impractically large query at the end.
  • The address cache now relies on proactive address eviction by StoreBlock via FreshenAddressCaches. Previously, the DB layer would consider cached data expired if the block returned by a cache query was behind the best block. This had the effect of invalidating the entire cache whenever a new block was recorded, even if the block contained no transactions that would actually invalidate the cache item for a given address. This is particularly important for keeping the legacy treasury entry valid, since it is no longer updated every block. Reorg and block disapproval are also set up to evict affected addresses.
  • Search page handler optimizations:
    • Just check for addresses with DecodeAddress and redirect to the address page. Don't use the searchrawtransactions RPC or the AddressHistory DB method.
    • Only try proposals after tx, block, and address.
    • Only try block if not "utxo-like".
    • Only try tx if there are fewer than 5 bytes of leading/trailing zeros. NOTE: 10 leading hex zeros ('0') have never come close to happening for a random txid. The most ever is 0000031b3776cfb6ea658198a89e96a83abfc72a401552dee6fdc4e26d30f3f1. Would anyone have an interest in hiding a tx from the search function by brute-forcing some element of the raw tx, such as an output amount? It would still be viewable on the /tx page or the containing /block page.
    • Only check DB for a tx in an orphaned block after everything else.
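The leading/trailing zeros check in the search handler list above can be sketched as follows. This is a minimal illustration, not the code from this PR: the names `worthTryingAsTx`, `leadingZeros`, `trailingZeros`, and `maxZeroHexChars` are hypothetical, and the sketch assumes that "5 bytes" means 10 hex characters of a 64-character hash string.

```go
package main

import (
	"encoding/hex"
	"strings"
)

// maxZeroHexChars is 5 bytes expressed as hex characters (assumed threshold).
const maxZeroHexChars = 10

// leadingZeros counts the consecutive '0' characters at the start of s.
func leadingZeros(s string) int {
	return len(s) - len(strings.TrimLeft(s, "0"))
}

// trailingZeros counts the consecutive '0' characters at the end of s.
func trailingZeros(s string) int {
	return len(s) - len(strings.TrimRight(s, "0"))
}

// worthTryingAsTx is a hypothetical helper. It reports whether a search
// string is a 64-character hex string with fewer than 5 bytes of leading
// zeros and fewer than 5 bytes of trailing zeros. Strings with long zero
// runs look like block hashes, so a tx lookup can be skipped for them.
func worthTryingAsTx(s string) bool {
	if len(s) != 64 {
		return false
	}
	if _, err := hex.DecodeString(s); err != nil {
		return false
	}
	return leadingZeros(s) < maxZeroHexChars &&
		trailingZeros(s) < maxZeroHexChars
}
```

With this sketch, the txid quoted above (five leading hex zeros) still qualifies for a tx lookup, while a string that starts with ten or more zeros does not.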

This also updates the README to require an SSD for the postgres process.

@chappjc changed the title from "[sharing] sync clarity and eliminate batch vouts.spend_tx_row_id update" to "sync clarity and eliminate batch vouts.spend_tx_row_id update" Jul 20, 2021
@chappjc marked this pull request as ready for review July 20, 2021 17:05
Don't miss cache if block info for a cache entry is old. Instead, rely
on StoreBlock to call FreshenAddressCaches with addresses to evict
from the cache.
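
The eviction approach described in this commit message can be illustrated with a small sketch. It is not dcrdata's actual cache: the `addressCache` type, its fields, and the `store`, `get`, and `freshen` methods are hypothetical stand-ins. The point is that `get` returns a hit no matter how old the stored height is, and only addresses named in a new block are evicted.

```go
package main

import "sync"

// cacheEntry is a hypothetical cached result for one address.
type cacheEntry struct {
	height  int64 // best block height when the data was cached
	balance int64
}

// addressCache is a hypothetical address-keyed cache.
type addressCache struct {
	mtx     sync.RWMutex
	entries map[string]cacheEntry
}

func newAddressCache() *addressCache {
	return &addressCache{entries: make(map[string]cacheEntry)}
}

// store saves or replaces the entry for addr.
func (c *addressCache) store(addr string, e cacheEntry) {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	c.entries[addr] = e
}

// get returns the entry for addr if present. It does not compare
// e.height against the current best block, so an entry stays valid
// until it is explicitly evicted.
func (c *addressCache) get(addr string) (cacheEntry, bool) {
	c.mtx.RLock()
	defer c.mtx.RUnlock()
	e, ok := c.entries[addr]
	return e, ok
}

// freshen evicts only the given addresses, for example those touched by
// a newly stored block, and returns how many entries were removed.
func (c *addressCache) freshen(addrs []string) int {
	c.mtx.Lock()
	defer c.mtx.Unlock()
	evicted := 0
	for _, a := range addrs {
		if _, ok := c.entries[a]; ok {
			delete(c.entries, a)
			evicted++
		}
	}
	return evicted
}
```

In this sketch a block-storing routine would collect the addresses appearing in the block's transactions and pass them to `freshen`, leaving every other cached address untouched.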

validate addresses before DB query or cache update
Rework (*explorerUI).Search:
- don't use searchrawtransactions
- don't try AddressHistory query
- validate address before redirecting to address page
- only try proposals after tx, block, addr
- only try block if not utxo-like
- only try tx if less than 5 bytes of leading/trailing zeros
- only check DB for tx in orphaned block after everything else

TODO: figure out if "wrong net" can be determined for an address
rather than simple ErrUnknownAddressType, which means many things,
not just wrong network.

log gettxout error