
Autoprune #4701

Closed
wants to merge 9 commits into from

Conversation

@rdponticelli
Contributor

rdponticelli commented Aug 14, 2014

This pull request implements a new mode of operation which automatically removes old block files, trying to keep the disk space used by the node under a maximum amount. This amount is configured by the user with the -prune switch.

There's also a lightweight sanity check which runs periodically during runtime to make sure that the minimum set of block files required for the node to be operative is present.

This should lower the amount of resources needed to run a node.

See the individual commits for details about all the changes introduced.
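
A rough sketch of the kind of runtime sanity check described above (minimal and illustrative only; it assumes the MIN_BLOCKS_TO_KEEP constant of 288 and the chainActive/CBlockIndex structures used elsewhere in this PR, and the function name is hypothetical):

bool CheckRequiredBlockDataPresent()
{
    // Walk back MIN_BLOCKS_TO_KEEP blocks from the tip and make sure each one
    // still has its data on disk; anything older may legitimately be pruned.
    const CBlockIndex* pindex = chainActive.Tip();
    for (int i = 0; i < MIN_BLOCKS_TO_KEEP && pindex != NULL; i++, pindex = pindex->pprev) {
        if (!(pindex->nStatus & BLOCK_HAVE_DATA))
            return false;
    }
    return true;
}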

@gmaxwell

Contributor

gmaxwell commented Aug 14, 2014

Very cool. I went to make some random suggestions but found you had implemented them already. I'll give this more review soon.

@laanwj laanwj added the Feature label Aug 18, 2014
@sipa
sipa reviewed Aug 23, 2014
src/init.cpp Outdated
@@ -225,6 +225,7 @@ std::string HelpMessage(HelpMessageMode mode)
strUsage += " -maxorphanblocks=<n> " + strprintf(_("Keep at most <n> unconnectable blocks in memory (default: %u)"), DEFAULT_MAX_ORPHAN_BLOCKS) + "\n";
strUsage += " -par=<n> " + strprintf(_("Set the number of script verification threads (%u to %d, 0 = auto, <0 = leave that many cores free, default: %d)"), -(int)boost::thread::hardware_concurrency(), MAX_SCRIPTCHECK_THREADS, DEFAULT_SCRIPTCHECK_THREADS) + "\n";
strUsage += " -pid=<file> " + _("Specify pid file (default: bitcoind.pid)") + "\n";
strUsage += " -pruned " + _("Run in a pruned state") + "\n";

@sipa

sipa Aug 23, 2014

Member

Nit: "Prune old blocks" may be an easier explanation for the flag.
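
In other words, the help line would read something like (a sketch of the suggested wording):

strUsage += " -pruned " + _("Prune old blocks") + "\n";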

@sipa
sipa reviewed Aug 23, 2014
src/main.cpp Outdated
if (fPruned) {
// Ignore requests which ask us for blocks we don't have any more
// Peers shouldn't ask, anyway, as we unset NODE_NETWORK on this mode.
LogPrintf("cannot load block from disk, ignoring request from peer=%d\n", pfrom->id);

@sipa

sipa Aug 23, 2014

Member

I think you should disconnect such peers (for their own sake) rather than just ignoring them.

@rdponticelli

rdponticelli Aug 24, 2014

Author Contributor

Added rejection and disconnection.

@TheBlueMatt
TheBlueMatt reviewed Aug 23, 2014
src/init.cpp Outdated
else
return InitError(_("Can't run with a wallet in pruned mode."));
}
#endif

@TheBlueMatt

TheBlueMatt Aug 23, 2014

Contributor

Actually, I might just always InitError here. It's not obvious from the explanation given that -pruned will disable the wallet, and it may be overly confusing.

@rdponticelli

rdponticelli Aug 24, 2014

Author Contributor

A proper explanation has been added.

@luke-jr
luke-jr reviewed Aug 23, 2014
src/init.cpp Outdated
@@ -950,6 +961,14 @@ bool AppInit2(boost::thread_group& threadGroup)
break;
}

if (!CheckAndPruneBlockFiles()) {
if (!fPruned)
strLoadError = _("Error checking required block files. You might try to run in -pruned mode, or try to rebuild the block database");

@luke-jr

luke-jr Aug 23, 2014

Member

Should probably warn of the destructive nature of -pruned?

@luke-jr
luke-jr reviewed Aug 23, 2014
src/main.cpp Outdated
bool CheckAndPruneBlockFiles()
{
// Check presence of essential data
int nKeepBlksFromHeight = fPruned ? (max((int)(chainActive.Height() - MIN_BLOCKS_TO_KEEP), 0)) : 0;

@luke-jr

luke-jr Aug 23, 2014

Member

Maybe this should be configurable? -pruned 500 to keep all blocks in the main chain at height 500+ vs -pruned -500 to keep the last 500?
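
A sketch of how such a signed argument might be interpreted (hypothetical, not code from the branch; GetArg and chainActive as used elsewhere in the codebase, nKeepBlksFromHeight as in the snippet above):

int nPruneArg = (int)GetArg("-pruned", 0);
int nKeepBlksFromHeight;
if (nPruneArg >= 0)
    nKeepBlksFromHeight = nPruneArg;                                 // keep all main-chain blocks from height N up
else
    nKeepBlksFromHeight = max(chainActive.Height() + nPruneArg, 0);  // keep only the last |N| blocks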

@rdponticelli

rdponticelli Sep 2, 2014

Author Contributor

I'm working on something like this. The code is still messy, but I would wait for it to be ready before merging this. The default would be -pruned=0, which means do not autoprune, just allow running pruned.

@luke-jr
luke-jr reviewed Aug 23, 2014
src/init.cpp Outdated
if (SoftSetBoolArg("-disablewallet", true))
LogPrintf("AppInit2 : parameter interaction: -pruned=1 -> setting -disablewallet=1\n");
else
return InitError(_("Can't run with a wallet in pruned mode."));

@luke-jr

luke-jr Aug 23, 2014

Member

Why not?

@rdponticelli

rdponticelli Aug 24, 2014

Author Contributor

Because otherwise people can very easily shoot themselves in the foot by plugging in an old wallet which silently fails rescanning, getting into an inconsistent state.

@luke-jr

luke-jr Aug 24, 2014

Member

Could just fail if it needs a rescan...

@rdponticelli

rdponticelli Aug 24, 2014

Author Contributor

Yeah, but I would rather leave that for another pull, when such a case has been more extensively tested.

@gmaxwell

Contributor

gmaxwell commented Aug 24, 2014

luke: wrt pruning depth, what would probably be good eventually is a size target which the software can then make use of... but I don't know that it makes sense at this point, since we don't yet have a good way to make use of a sparse blockchain.

@luke-jr

Member

luke-jr commented Aug 24, 2014

We have to use it for reorgs. Setting a default prune depth is probably dangerous enough to become an (inconsistent) consensus rule already.

@gmaxwell

Contributor

gmaxwell commented Aug 24, 2014

By sparse I mean containing any blocks other than the last N. If you'll note, the number 288 above comes from my comments on the prior PR as the absolute minimum I'd consider acceptable for the purpose of reorgs.

@rdponticelli rdponticelli force-pushed the Criptomonedas:autoprune branch 3 times, most recently Aug 24, 2014
@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
@@ -2881,11 +2884,17 @@ bool CheckDiskSpace(uint64_t nAdditionalBytes)
return true;
}

boost::filesystem::path GetBlockFilePath(const CDiskBlockPos &pos, const char *prefix)
{
boost::filesystem::path path = GetDataDir() / "blocks" / strprintf("%s%05u.dat", prefix, pos.nFile);

@sipa

sipa Aug 26, 2014

Member

Nit: no need for the intermediate variable.
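
In other words, the function can return the expression directly, something like:

boost::filesystem::path GetBlockFilePath(const CDiskBlockPos &pos, const char *prefix)
{
    return GetDataDir() / "blocks" / strprintf("%s%05u.dat", prefix, pos.nFile);
}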

@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
@@ -2912,6 +2921,14 @@ FILE* OpenUndoFile(const CDiskBlockPos &pos, bool fReadOnly) {
return OpenDiskFile(pos, "rev", fReadOnly);
}

bool RemoveBlockFile(const CDiskBlockPos &pos) {
return boost::filesystem::remove(GetBlockFilePath(pos, "blk")) ? true : false;

@sipa

sipa Aug 26, 2014

Member

No need for the ? true : false.
boost::filesystem::remove already returns a bool.
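
So the function could simply be:

bool RemoveBlockFile(const CDiskBlockPos &pos) {
    return boost::filesystem::remove(GetBlockFilePath(pos, "blk"));
}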

@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
}

bool RemoveUndoFile(const CDiskBlockPos &pos) {
return boost::filesystem::remove(GetBlockFilePath(pos, "rev")) ? true : false;

@sipa

sipa Aug 26, 2014

Member

Same.

@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
if (fPruned) {
// Reject requests and disconnect peers asking us for blocks we don't have.
// Doing so we avoid stalling their IBD but they shouldn't ask as we unset NODE_NETWORK on this mode.
LogPrintf("cannot load block from disk, rejecting request and disconnectiong peer:%d\n", pfrom->id);

@sipa

sipa Aug 26, 2014

Member

disconnecting

@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
if (chainActive.Height() > AUTOPRUNE_AFTER_HEIGHT && (int)info.nHeightLast < nKeepBlksFromHeight) {
if (RemoveBlockFile(pos)) {
LogPrintf("File blk%05u.dat removed\n", pindex->nFile);
mapBlkDataFileReadable.insert(make_pair(pindex->nFile, false));

@sipa

sipa Aug 26, 2014

Member

mapBlkDataFileReadable[pindex->nFile] = false;

etc

@sipa
sipa reviewed Aug 26, 2014
src/main.cpp Outdated
// Reject requests and disconnect peers asking us for blocks we don't have.
// Doing so we avoid stalling their IBD but they shouldn't ask as we unset NODE_NETWORK on this mode.
LogPrintf("cannot load block from disk, rejecting request and disconnectiong peer:%d\n", pfrom->id);
pfrom->PushMessage("reject", string("getdata"), REJECT_NOTFOUND, string("Not found"));

@sipa

sipa Aug 26, 2014

Member

notfound message instead of reject?
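
A minimal sketch of answering with a notfound message instead (inv and pfrom as in the surrounding getdata handler; illustrative only):

// Tell the peer we don't have this block, then drop the connection so it
// can fetch the data elsewhere instead of stalling its IBD on us.
std::vector<CInv> vNotFound;
vNotFound.push_back(inv);
pfrom->PushMessage("notfound", vNotFound);
pfrom->fDisconnect = true;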

@rdponticelli rdponticelli force-pushed the Criptomonedas:autoprune branch 2 times, most recently Aug 26, 2014
@sipa

Member

sipa commented Aug 26, 2014

When a block is being disconnected due to a reorg, and its data cannot be loaded from disk, there is currently just a state.Abort with "Failed to read block". Exceedingly unlikely, but we need to be able to deal with such situations. I wonder whether crashing with some extra help/debug output may be enough, or whether we need to retry downloading the missing data...

EDIT: Downloading the missing data might work for block data, but not for undo data, so it will be unlikely to be useful.

@gmaxwell

Contributor

gmaxwell commented Aug 26, 2014

I'd like to re-download, and thought that would be interesting to explore with headers-first in place, but the problem is that if the undo data is deleted we cannot usefully redownload.

Edit: ah, you noticed that. Yeah, well, we could have different retention policies for undo data. I considered that future work. If we ever make the undo data normative, we could just store hashes of it and fetch it from peers too.

@sipa

Member

sipa commented Aug 26, 2014

Of course, we could for example delete block data at depth N, but only delete undo data at depth N*3 or so (undo data is 7-10 times smaller than block data). That just moves the problem further out, to what to do when an N*3-deep reorg is encountered.

@sipa

Member

sipa commented Aug 27, 2014

Untested ACK. I guess I'm fine with resolving the missing-block/undo problem for reorgs later.

@rdponticelli rdponticelli force-pushed the Criptomonedas:autoprune branch Aug 27, 2014
@rdponticelli rdponticelli force-pushed the Criptomonedas:autoprune branch 2 times, most recently Aug 28, 2014
These are the main functional changes in this state:

* Do not allow running with a wallet or txindex.
* Checking for data at startup is mandatory only for the last 288 blocks.
* The NODE_NETWORK flag is unset.
* Requests from other peers for pruned blocks are answered with "notfound" and the peers are disconnected, so as not to stall their IBD.
This mode introduces a configuration parameter to keep block files below a fixed number of MiB.
We can do this now that the logic to avoid opening the files several times has been
moved into its own functions and is handled mainly through variables.
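
The size-target check described in this commit message could look roughly like the following (a sketch only; it assumes nPrune holds the configured target in bytes and vinfoBlockFile is the per-file CBlockFileInfo vector tracking nSize and nUndoSize):

uint64_t nCurrentUsage = 0;
BOOST_FOREACH(const CBlockFileInfo& info, vinfoBlockFile)
    nCurrentUsage += info.nSize + info.nUndoSize;
// Only start deleting old block/undo files once the configured budget is exceeded.
bool fNeedPrune = nCurrentUsage > nPrune;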
@rdponticelli rdponticelli force-pushed the Criptomonedas:autoprune branch from f53d60e to bbb769c Jan 12, 2015
@rdponticelli

Contributor Author

rdponticelli commented Jan 12, 2015

Rebased.

BOOST_FOREACH(PairType& pair, merkleBlock.vMatchedTxn)
if (!pfrom->setInventoryKnown.count(CInv(MSG_TX, pair.second)))
pfrom->PushMessage("tx", block.vtx[pair.first]);
if (!ReadBlockFromDisk(block, (*mi).second)) {

@theuni

theuni Jan 12, 2015

Member

It seems this doesn't make the distinction between missing a pruned block and a failed read. If a non-pruned block fails to read when pruning is enabled, shouldn't we fail as before?

Alternatively, couldn't the pruning check happen before ReadBlockFromDisk(), to avoid the overhead entirely for pruned nodes? If we're comfortable with randomly answering with a notfound, why not do it constantly?

@sipa

sipa Feb 16, 2015

Member

There is a distinction in the expectation of what a node does. If you enable pruning, the node does not promise to the network to behave as a full node, so it's fine not to answer. If a node advertises as NODE_NETWORK and can't answer a request for a block, it's buggy.
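
A sketch of the early-out theuni suggests above, checking the index flags before attempting the disk read (block, mi, inv and pfrom as in the surrounding getdata handler; illustrative only):

CBlockIndex* pindex = (*mi).second;
if (fPruned && !(pindex->nStatus & BLOCK_HAVE_DATA)) {
    // Block has been pruned: answer notfound without attempting a disk read.
    std::vector<CInv> vNotFound;
    vNotFound.push_back(inv);
    pfrom->PushMessage("notfound", vNotFound);
    pfrom->fDisconnect = true;
} else if (!ReadBlockFromDisk(block, pindex)) {
    // A block we still claim to have failed to read: keep whatever error
    // handling the non-pruned path already has here.
}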

@shivaenigma


shivaenigma commented Jan 29, 2015

Testing this from https://github.com/luke-jr/bitcoin/tree/0.10.0rc3.autoprune
Just curious: what is the desired behaviour of -reindex when run on already-pruned blocks?

@Michagogo

Contributor

Michagogo commented Jan 29, 2015

I would assume reindexing would force it to redownload all the blocks from scratch.

@mrbandrews

Contributor

mrbandrews commented Jan 29, 2015

Hi. I've been testing this also (building from source) and I think the latest commit may have re-introduced the issue of re-opening a block and undo file for each block in the active chain. Thus, on testnet (about 320k blocks) each call to CheckBlockFiles results in 640k calls to the file system. I know that @rdponticelli has a separate PR (#4515) which appears to still have the code which should prevent this (using setRequiredDataFilesAreOpenable) - and which autoprune may eventually be built on top of.

It seems from the comment on the last commit that this check was intended to be moved into a different function, but if so, it doesn't seem to be working as intended?

@21E14

Contributor

21E14 commented Feb 1, 2015

This has been tagged as v0.11. What time frame is that indicative of?

@gmaxwell

Contributor

gmaxwell commented Feb 1, 2015

@21E14 presumably and hopefully in the next couple of months. Right now much attention is focused on getting 0.10 out (as it should be); after that you should expect to see more attention from the rest of the contributors on getting this merged.

@laanwj

Member

laanwj commented Feb 1, 2015

@21E14 July 2015 is the time frame for 0.11. The tag is no guarantee that it will make it into that release, though; it's just a reminder. If it isn't ready to merge well before 0.11's release date, it will be bumped to 0.12.

You can help by testing and reviewing the code.

@21E14

Contributor

21E14 commented Feb 2, 2015

@gmaxwell @laanwj I'm assuming a few minor releases in-between?

This PR is looking pretty good so far. Running the daemon though, just for kicks, with the prune option set to less than 300 MiB results in the following awkward message:


AppInit2 : parameter interaction: -prune -> setting -disablewallet=1
Autoprune configured below the minimum of 300MiB. Setting at the maximum possible of 17592186044415MiB, to avoid pruning too much. Please, check your configuration.


More to the point, why even let a 'misconfigured prune' carry on?

@shivaenigma


shivaenigma commented Feb 4, 2015

Did more testing from https://github.com/luke-jr/bitcoin/tree/0.10.0rc3.autoprune
Tested using -prune=300 and blocks are getting deleted

But sometimes the size of .bitcoin/blocks is much more than 300 MiB:
2015-02-04 17:34:59 Data for blocks from 1 to 184525 has been pruned
2015-02-04 17:34:59 Undo data for blocks from 1 to 184525 has been pruned
du -sh ~/.bitcoin/blocks/ reports 496MB

Switching from pruned mode to non-pruned mode causes this:
Error checking required block files. There must be missing or unreadable data. Do you want to rebuild the block database now?

@Michagogo

Contributor

Michagogo commented Feb 4, 2015

@shivaenigma What's the size of the index/ directory? Perhaps that's the rest. I don't know if the index shrinks with a pruned node. A couple hundred megabytes is insignificant compared to 30+ GB, but indeed, with pruned nodes like this that does become a factor. And at the end, do you mean switching to non-pruned mode? In that case, yes, of course you're missing data -- you've just deleted most of the blockchain!

@shivaenigma


shivaenigma commented Feb 4, 2015

@Michagogo
The size of my blocks/index/ is 34MB. I think the parameter -pruned=300MB is misleading if, even after pruning, the size ends up at 496MB. How does the error scale: if I set 2GB, will it take 2.1GB or 3GB?

So I guess it's checking all the blocks at startup in non-pruned mode and throws an error. I think there should be a way to disable this check, because now I can never switch from pruned mode to non-pruned mode even if I don't care about the missing initial blocks.

@Michagogo

Contributor

Michagogo commented Feb 4, 2015

Uh, what? By definition, non-pruned means that you have the entire blockchain. So yes, you do need to redownload it. It would be nice, though, if it were smart enough to make the switch gracefully and just fill in the history with the headers-first mechanism. If you mean you don't want to prune more, you can just set the threshold to something high (100000000 or whatever) and you'll be covered for the foreseeable future.


@shivaenigma


shivaenigma commented Feb 5, 2015

If you mean you don't want to prune more, you can just set the threshold to something high (100000000 or whatever) and you'll be covered for the foreseeable future

Yes, this is actually what I wanted. Thanks.

LogPrintf("Autoprune configured to use less than %uMiB on disk for block files.\n", nPrune / 1024 / 1024);
else {
nPrune = ~0;
LogPrintf("Autoprune configured below the minimum of %uMiB. Setting at the maximum possible of %uMiB, to avoid pruning too much. Please, check your configuration.\n", MIN_BLOCK_FILES_SIZE / 1024 / 1024, nPrune / 1024 / 1024);

@sipa

sipa Feb 16, 2015

Member

I think it's more clear if you just say something like "Leaving pruned mode enabled, but not deleting any more blocks for now".
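
For example, the log line might become something like (a possible rewording, not the exact text from the branch):

LogPrintf("Autoprune configured below the minimum of %uMiB. Leaving pruned mode enabled, but not deleting any more blocks for now.\n", MIN_BLOCK_FILES_SIZE / 1024 / 1024);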

return false;
}
bool fFileRemoved = false;
int dataPrunable = *setDataFilePrunable.begin(), undoPrunable = *setUndoFilePrunable.begin();

@sipa

sipa Feb 16, 2015

Member

Is it guaranteed that setDataFilePrunable and setUndoFilePrunable are both non-empty at this stage? The test above only guarantees that at least one of them is non-empty. Don't dereference begin() of an empty set.
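
A sketch of guarding those dereferences (names as in the snippet above; illustrative only):

int dataPrunable = -1, undoPrunable = -1;
if (!setDataFilePrunable.empty())
    dataPrunable = *setDataFilePrunable.begin();
if (!setUndoFilePrunable.empty())
    undoPrunable = *setUndoFilePrunable.begin();
// Only act on dataPrunable/undoPrunable when the corresponding set was non-empty.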

@@ -2886,7 +2979,7 @@ bool static LoadBlockIndexDB()
{
CBlockIndex* pindex = item.second;
pindex->nChainWork = (pindex->pprev ? pindex->pprev->nChainWork : 0) + GetBlockProof(*pindex);
if (pindex->nStatus & BLOCK_HAVE_DATA) {
if (pindex->nStatus & BLOCK_HAVE_DATA || pindex->nStatus & BLOCK_VALID_CHAIN) {

@sipa

sipa Feb 16, 2015

Member

This may lose never-fully-connected branches. I think this condition as a whole should just change to (pindex->nStatus & BLOCK_VALID_TRANSACTIONS && pindex->nTx != 0).

}
}
}
if (~pindex->nStatus & BLOCK_HAVE_DATA && pindex->nStatus & BLOCK_VALID_CHAIN)

@sipa

sipa Feb 16, 2015

Member

This shouldn't require BLOCK_VALID_CHAIN anymore; I think BLOCK_VALID_TRANSACTIONS is enough.

set<int> setDataPruned, setUndoPruned;
setDataFilePrunable.clear();
setUndoFilePrunable.clear();
for (CBlockIndex* pindex = chainActive.Tip(); pindex && pindex->pprev; pindex = pindex->pprev) {

@sipa

sipa Feb 16, 2015

Member

This will not iterate over side branches of the block chain, and may perhaps miss things that way. The chances for that are really low, as this code still checks the last height in each affected file, and then deletes whole files, so unless a file consists solely of side branch blocks, it will still be processed.

However, I think this flaw indicates a conceptual mistake: this function should iterate over files rather than over blocks. That would be faster too. The consistency check can be done on blocks separately, but it only needs to be done for the last blocks, I guess?
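
A rough sketch of the file-oriented approach (assuming a per-file info array such as vinfoBlockFile and the nKeepBlksFromHeight cutoff used earlier; illustrative only):

for (int nFile = 0; nFile < nLastBlockFile; nFile++) {
    const CBlockFileInfo& info = vinfoBlockFile[nFile];
    // If the newest block in this file is still below the cutoff, the whole
    // file (and its undo counterpart) is prunable.
    if ((int)info.nHeightLast < nKeepBlksFromHeight) {
        setDataFilePrunable.insert(nFile);
        setUndoFilePrunable.insert(nFile);
    }
}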

@sipa

Member

sipa commented Mar 1, 2015

Do you plan to work on this any more in the future? If not, I may try to maintain/update it.

@laanwj

Member

laanwj commented Mar 9, 2015

Closing in favor of #5863

@laanwj laanwj closed this Mar 9, 2015