Autoprune #4701
Conversation
Very cool. I went to make some random suggestions but found you had implemented them already. I'll give this more review soon.
@@ -225,6 +225,7 @@ std::string HelpMessage(HelpMessageMode mode)
    strUsage += " -maxorphanblocks=<n> " + strprintf(_("Keep at most <n> unconnectable blocks in memory (default: %u)"), DEFAULT_MAX_ORPHAN_BLOCKS) + "\n";
    strUsage += " -par=<n> " + strprintf(_("Set the number of script verification threads (%u to %d, 0 = auto, <0 = leave that many cores free, default: %d)"), -(int)boost::thread::hardware_concurrency(), MAX_SCRIPTCHECK_THREADS, DEFAULT_SCRIPTCHECK_THREADS) + "\n";
    strUsage += " -pid=<file> " + _("Specify pid file (default: bitcoind.pid)") + "\n";
    strUsage += " -pruned " + _("Run in a pruned state") + "\n";
Nit: "Prune old blocks" may be an easier explanation for the flag.
luke: wrt pruning depth, probably what would be good eventually is a size target, and then the software can make use of the size target usefully... but I don't know that it makes sense at this point, since we don't yet have a good way to make use of a sparse blockchain.
We have to use it for reorgs. Setting a default prune depth is probably dangerously close to becoming an (inconsistent) consensus rule already.
By sparse I mean containing any blocks other than the last N. If you'll note, the number 288 above comes from my comments on the prior PR as the absolute minimum I'd consider acceptable for the purpose of reorgs.
@@ -2881,11 +2884,17 @@ bool CheckDiskSpace(uint64_t nAdditionalBytes)
    return true;
}

boost::filesystem::path GetBlockFilePath(const CDiskBlockPos &pos, const char *prefix)
{
    boost::filesystem::path path = GetDataDir() / "blocks" / strprintf("%s%05u.dat", prefix, pos.nFile);
Nit: no need for the intermediate variable.
When a block is being disconnected due to a reorg, and its data cannot be loaded from disk, there is currently just a state.Abort with "Failed to read block". Exceedingly unlikely, but we need to be able to deal with such situations. I wonder whether crashing with some extra help/debug output may be enough, or whether we need to retry downloading the missing data... EDIT: Downloading the missing data might work for block data, but not for undo data, so it is unlikely to be useful.
I'd like to re-download, and thought that would be interesting to explore with headers-first in place— but the problem is that if the undo data is deleted we cannot usefully redownload. Edit: ah, you noticed that. Yeah, well— we could have different retention policies for undo data. I considered that future work. If we ever make the undo data normative, we could just store hashes of it and fetch it from peers too.
Of course, we could for example delete block data at depth N, but only delete undo data at depth N*3 or so (undo data is 7-10 times smaller than block data). Of course, that just moves the problem further to what to do when an N*3-deep reorg is encountered.
Untested ACK. I guess I'm fine with resolving the missing-block/undo problem for reorgs later.
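[Editor's note] The differential-retention idea floated in this comment could be sketched roughly as follows. These helper names are hypothetical, not code from the PR; the keep-depth N and the roughly threefold factor for undo data come from the comment above:

```cpp
#include <cassert>

// Hypothetical sketch: prune block data once a block is more than
// nKeepDepth deep in the chain, but keep its (7-10x smaller) undo data
// until it is roughly three times that deep.
bool CanPruneBlockData(int nHeight, int nTipHeight, int nKeepDepth) {
    return nTipHeight - nHeight > nKeepDepth;
}

bool CanPruneUndoData(int nHeight, int nTipHeight, int nKeepDepth) {
    return nTipHeight - nHeight > 3 * nKeepDepth;
}
```

With nKeepDepth = 288, the genesis block's data would become prunable once the tip passes height 288, while its undo data would be retained until height 864, so a moderately deep reorg can still be undone after the block data is gone.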
These are the main functional changes in this state:
* Do not allow running with a wallet or txindex.
* Checking for data at startup is mandatory only for the last 288 blocks.
* The NODE_NETWORK flag is unset.
* Requests for pruned blocks from other peers are answered with "notfound", and those peers are disconnected so as not to stall their IBD.
This mode introduces a configuration parameter to keep block files at less than a fixed amount of MiB.
We can do it now that the logic to avoid opening the files several times has been moved to their own functions and is handled mainly through variables.
Rebased.
BOOST_FOREACH(PairType& pair, merkleBlock.vMatchedTxn)
    if (!pfrom->setInventoryKnown.count(CInv(MSG_TX, pair.second)))
        pfrom->PushMessage("tx", block.vtx[pair.first]);
if (!ReadBlockFromDisk(block, (*mi).second)) {
It seems this doesn't make the distinction between missing a pruned block and a failed read. If a non-pruned block fails to read when pruning is enabled, shouldn't we fail as before?
Alternatively.. couldn't the pruning check happen before ReadBlockFromDisk(), to avoid the overhead entirely for pruned nodes? If we're comfortable with randomly answering with a notfound, why not do it constantly?
There is a distinction in the expectation of what a node does. If you enable pruning, the node does not promise to the network to behave as a full node, so it's fine to not answer. If a node advertises as NODE_NETWORK and can't answer a request for a block, it's buggy.
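[Editor's note] The behaviour being discussed — answering with "notfound" when a pruned node is asked for a block it has deleted — can be sketched as follows. The names (`BlockIndexStub`, `RespondToGetData`) are illustrative only, not the PR's actual code:

```cpp
#include <cassert>
#include <string>

// Minimal stand-in for the block index entry: a pruned node keeps the
// index for every block, but fHaveData becomes false once the block
// file holding it has been deleted.
struct BlockIndexStub {
    int nHeight;
    bool fHaveData;
};

// Sketch of the getdata response decision: a pruned node replies
// "notfound" for deleted blocks (and the PR then disconnects the peer
// so its IBD does not stall), instead of failing the read as a
// NODE_NETWORK node would have to.
std::string RespondToGetData(const BlockIndexStub& index) {
    if (!index.fHaveData)
        return "notfound";
    return "block";
}
```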
Testing this from https://github.com/luke-jr/bitcoin/tree/0.10.0rc3.autoprune
I would assume reindexing would force it to redownload all the blocks from the network.
Hi. I've been testing this also (building from source) and I think the latest commit may have re-introduced the issue of re-opening a block and undo file for each block in the active chain. Thus, on testnet (about 320k blocks) each call to CheckBlockFiles results in 640k calls to the file system. I know that @rdponticelli has a separate PR (#4515) which appears to still have the code which should prevent this (using setRequiredDataFilesAreOpenable), and which autoprune may eventually be built on top of. It seems from the comment on the last commit that this check was intended to be moved into a different function, but if so, it doesn't seem to be working as intended?
This has been tagged as v0.11. What time frame is that indicative of?
@21E14 presumably and hopefully in the next couple of months. Right now much attention is focused on getting 0.10 out (as it should be); after that you should expect to see more attention on getting this merged from the rest of the contributors.
@21E14 July 2015 is the time frame for 0.11. The tag is no guarantee that it will make it into that release, though, just a reminder. If it isn't ready to merge well before 0.11's release date it will be bumped to 0.12. You can help by testing and reviewing the code.
@gmaxwell @laanwj I'm assuming a few minor releases in-between? This PR is looking pretty good so far. Running the daemon though, just for kicks, with the prune option set to less than 300 MiB results in the following awkward message:

AppInit2 : parameter interaction: -prune -> setting -disablewallet=1

More to the point, why even let a 'misconfigured prune' carry on?
Did more testing from https://github.com/luke-jr/bitcoin/tree/0.10.0rc3.autoprune. But sometimes the size of .bitcoin/blocks is much more than 300 MiB. Switching from pruned mode to non-pruned mode causes this:
@shivaenigma What's the size of the index/ directory? Perhaps that's the rest. I don't know if the index shrinks with a pruned node. A couple hundred megabytes is insignificant compared to 30+ GB, but indeed, with pruned nodes like this that does become a factor. And at the end, do you mean switching to non-pruned mode? In that case, yes, of course you're missing data -- you've just deleted most of the blockchain!
@Michagogo so I guess it's checking all the blocks at startup in non-pruned mode and throws an error. I think there should be a way to disable this check, because now I can never switch from pruned mode to non-pruned mode, even if I don't care about the missing initial blocks.
Uh, what? By definition, non-pruned means that you have the entire blockchain.
Yes, this is actually what I wanted. Thanks.
    LogPrintf("Autoprune configured to use less than %uMiB on disk for block files.\n", nPrune / 1024 / 1024);
else {
    nPrune = ~0;
    LogPrintf("Autoprune configured below the minimum of %uMiB. Setting at the maximum possible of %uMiB, to avoid pruning too much. Please, check your configuration.\n", MIN_BLOCK_FILES_SIZE / 1024 / 1024, nPrune / 1024 / 1024);
I think it's more clear if you just say something like "Leaving pruned mode enabled, but not deleting any more blocks for now".
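[Editor's note] The clamping behaviour shown in the diff above can be sketched like this. The function name is hypothetical, and the 300 MiB minimum is an assumption taken from the discussion in this thread, not a confirmed constant from the PR:

```cpp
#include <cassert>
#include <cstdint>

// Assumed minimum, per the thread's discussion of sub-300 MiB targets.
static const uint64_t MIN_BLOCK_FILES_SIZE = 300ULL * 1024 * 1024;

// Sketch of the behaviour in the diff above: a prune target below the
// safe minimum is replaced with the maximum possible value, i.e. prune
// mode stays enabled but no more blocks are deleted for now.
uint64_t EffectivePruneTarget(uint64_t nPrune) {
    if (nPrune < MIN_BLOCK_FILES_SIZE)
        return ~(uint64_t)0;
    return nPrune;
}
```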
Do you plan to work on this any more in the future? If not, I may try to maintain/update it.
Closing in favor of #5863 |
This pull implements a new mode of operation which automatically removes old block files, trying to keep the disk space used by the node under a maximum amount. This amount is configured by the user with the -prune switch.
There's also a lightweight sanity check which runs periodically during runtime to make sure the minimum set of block files required for the node to operate is present.
This should lower the amount of resources needed to run a node.
See the individual commits for details on all the changes introduced.
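[Editor's note] The selection strategy the description implies — delete the oldest block files until total usage fits under the -prune target, while never touching files containing blocks still needed for reorgs — can be sketched as follows. All names here are illustrative stand-ins, not the PR's actual code; 288 is the minimum keep-depth discussed earlier in the thread:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for a blk?????.dat entry: its size on disk and the height
// of the highest block it contains.
struct BlockFileStub {
    uint64_t nSize;
    int nHeightLast;
};

// Walk files oldest-first, marking them for deletion until total usage
// drops to the target. Stop as soon as a file holds any of the most
// recent nKeepDepth blocks, which must be retained for reorgs.
std::vector<size_t> SelectFilesToPrune(const std::vector<BlockFileStub>& files,
                                       uint64_t nPruneTarget,
                                       int nChainHeight,
                                       int nKeepDepth /* e.g. 288 */) {
    uint64_t nTotal = 0;
    for (const BlockFileStub& f : files)
        nTotal += f.nSize;

    std::vector<size_t> toPrune;
    for (size_t i = 0; i < files.size() && nTotal > nPruneTarget; ++i) {
        if (files[i].nHeightLast > nChainHeight - nKeepDepth)
            break; // file still contains blocks we must keep
        nTotal -= files[i].nSize;
        toPrune.push_back(i);
    }
    return toPrune;
}
```

Note that in this sketch the target can remain unmet if all remaining files hold recent blocks, which matches the thread's observation that actual disk usage can exceed the configured target.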