Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional transaction index to databases #2168

Merged
merged 1 commit into from
Jan 25, 2013

Conversation

sipa
Copy link
Member

@sipa sipa commented Jan 11, 2013

By specifying -txindex when initializing the database, a txid-to-diskpos index is maintained in the blktree database. This database is used to help answering getrawtransaction() RPC queries, when enabled.

Changing the -txindex value requires a -reindex; the client will abort at startup if the database and the specified -txindex mismatch.

)

CDiskTxPos(const CDiskBlockPos &blockIn, unsigned int nTxOffsetIn) : CDiskBlockPos(blockIn.nFile, blockIn.nPos), nTxOffset(nTxOffsetIn) {
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks a little weird, can you put that } at the end of the above line?

@Diapolo
Copy link

Diapolo commented Jan 11, 2013

How much will this increase a nodes load / how much larger will the resulting database be?

@sipa
Copy link
Member Author

sipa commented Jan 11, 2013

@Diapolo Around 500 MB extra in storage, and a lot of extra I/O. I didn't spend much effort optimizing this, as I don't consider this functionality a priority.

@Diapolo
Copy link

Diapolo commented Jan 11, 2013

@sipa Thanks, I was just interested in the technical base aspects :).

@mikehearn
Copy link
Contributor

Could you grab a service bit and make "getdata" use the new index too? Or is that too much additional work for this change?

@sipa
Copy link
Member Author

sipa commented Jan 14, 2013

@mikehearn I'm absolutely against making this available to the P2P network. If there is one thing I don't want services to depend on, then it is the availability of a fully indexed transaction history. If you really need one, fine, but maintain it yourself.

@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/1f4691130ea0c54c69ceb04407507cd80dd05133 for binaries and test log.

@mikehearn
Copy link
Contributor

I'm not sure I see the issue. Yes you can make a node do lots of disk IO, but downloading the block chain does that too. It'd be nice to fix, but it's not a new attack vector.

Apps that need indexes like that will end up just using blockchain.info or various random other sites/protocols to get what they want. It's not like those apps will go away if the P2P network doesn't give them what they need.

@sipa
Copy link
Member Author

sipa commented Jan 14, 2013

The only thing keeping history is necessary for, is bootstrapping fully validating nodes. However, that doesn't require an index (or even a Bitcoin node at all - it could be provided by an HTTP-based file service or other protocols - something that web has plenty of). I'm sure that making a feature available on the P2P will result in infrastructure depending on it, something that would burden the nodes that provide archive data. Yes, making it an optional feature makes this much less of a problem, but I still prefer to be conservative with functions the P2P network provides unless there is a very clear use.

As far as I know, all versions before 0.7.0 maintained a full tx-to-diskpos index, and there was no way at all to query it (not even RPC). I'm quite sure this was a deliberate choice by Satoshi. In a mail about maintaining per-txout spendability of wallet transactions, he clearly said not to rely on the txindex for this, as it wouldn't always be available.

EDIT: I dislike this feature as whole (even as RPC), as I think that services that are built not to depend on infinite history have better scalability. It's extremely useful for debugging though, so I added it. I'm sure people will use it for other purposes as well, and that's clearly preferable to having them depend on a centralized service for it. Providing it via P2P doesn't add much to that, IMHO.

@mikehearn
Copy link
Contributor

OK, good points. Now I agree with you.

By specifying -txindex when initializing the database, a txid-to-diskpos
index is maintained in the blktree database. This database is used to
help answering getrawtransaction() RPC queries, when enabled.

Changing the -txindex value requires a -reindex; the client will abort
at startup if the database and the specified -txindex mismatch.
@BitcoinPullTester
Copy link

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/2d1fa42e85c9164688aa69b3f54f015fbefc06aa for binaries and test log.

@gavinandresen
Copy link
Contributor

Test plan that I'll finish executing after lunch:

Default arguments, getrawtransaction w. arbitrary old, not-in-your-wallet txid, fully-spent txn
EXPECT: transaction not found

Rerun with -txindex=1
EXPECT: startup fails with "must -reindex" message

Note disk space used by blktree/ subdirectory
Rerun with -txindex=1 -reindex=1 -logtimestamps=1
EXPECT: long background process to rebuild index (Q: how long?)
Wait until "Reindexing finished" appears in debug.log, then:
getrawtransaction
EXPECT: success
Q: How much extra disk space ?

Rerun with -txindex=0:
EXPECT: startup fails with "must -reindex" message

Note disk space used by blktree/
Rerun with -txindex=0 -reindex=1 -logtimestamps=1
EXPECT: long background process to rebuild index (Q: how long?)
Wait until "Reindexing finished" appears in debug.log, then:
getrawtransaction w. old txid
EXPECT: transaction not found
EXPECT: txindex disk space freed from blktree/

@sipa
Copy link
Member Author

sipa commented Jan 25, 2013

@gavinandresen RE test plan:

  • getrawtransaction without txindex should work for not-fully spent confirmed transactions, mempool transactions or transactions in the relay cache. For not-fully spent confirmed transactions, it may be slower than with txindex present. In general, without txindex I consider getrawtransaction to just work on a best-effort basis.
  • extra disk space caused by txindex will be in the blktree/ directory

Otherwise the plan looks correct and complete to me.

EDIT: -reindex doesn't cause a slow startup, the startup is always instant, but importing (and thus building the txindex) happens in the background.

@gavinandresen
Copy link
Contributor

Test results: success!

Disk space: blktree/ 30M --> 600M

Time to -reindex: (note: running -g build, no -dbcache set)
1 hour 50 minutes (same time, with/without -fullindex)

I'm going to pull.

gavinandresen added a commit that referenced this pull request Jan 25, 2013
Add optional transaction index to databases
@gavinandresen gavinandresen merged commit 63cc766 into bitcoin:master Jan 25, 2013
@sipa sipa deleted the txindex branch May 3, 2013 18:51
@aaronash
Copy link

Just wanted to say thanks for working on this. Being able to getrawtransaction on pretty much anything removes reliance on 3rd party services, which I find very valuable.

laudney pushed a commit to reddcoin-project/reddcoin-3.10 that referenced this pull request Mar 19, 2014
Add optional transaction index to databases
owlhooter pushed a commit to owlhooter/mazacoin-new that referenced this pull request Oct 11, 2018
* Remove testnet seeds from devnet

* Lift multiple ports restriction on devnet when considering new nodes

Allow to connect to multiple nodes behind the same IP

* Don't skip addresses with non-default port if it matches -port

If the user specified -port, he very likely intends to connect to nodes
with the same port.

* Don't pass false to CAddrMan constructor as it is already the default

* Make if statements easier to read
guruvan added a commit to guruvan/maza that referenced this pull request Nov 8, 2018
* MAZA-POS: (5575 commits)
  Mazafication of code More mazafication More mazafications and compile correction fixes fix for build issues fix for build issues fix string in net.cpp correct pow.cpp correct validation.cpp fixing for build errors fix typo Merge remote-tracking branch 'origin/MAZA-POS' into MAZA-POS
  merge to dash rebase
  Release notes 0.12.3.3
  Remove redundant parameter fCheckDuplicateInputs from CheckTransaction
  Fix crash bug with duplicate inputs within a transaction
  Bump to 0.12.3.3
  Release notes 0.12.3.2 (bitcoin#2174)
  Add tests for special rules for slow blocks on devnet/testnet (bitcoin#2176)
  Allow mining min diff for very slow (2h+) blocks (bitcoin#2175)
  Fix issues with selections on Masternode tab (bitcoin#2170)
  Sync mn list and mnw list from 3 peers max (bitcoin#2169)
  A few devnet related fixes (bitcoin#2168)
  Adjust diff for slow testnet/devnet blocks a bit smoother (bitcoin#2161)
  Make PS Buttons not react to spacebar (bitcoin#2154)
  Bump to 0.12.3.2 (bitcoin#2173)
  Bump to 0.12.3.1 (bitcoin#2158)
  Update release notes (bitcoin#2155)
  Use correct protocol when serializing messages in reply to `getdata` (bitcoin#2157)
  Fix p2pkh tests asserts (bitcoin#2153)
  Fix block value/payee validation in lite mode (bitcoin#2148)
  ...
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants