Upgrading to 0.8: re-use blkNNNN.dat files. #2099

Merged
gavinandresen merged 1 commit into bitcoin:master from gavinandresen/blkfile_upgrade on Jan 14, 2013

Contributor

gavinandresen commented Dec 12, 2012

This uses boost::filesystem::create_hard_link to hard-link the pre-0.8 blkNNNN.dat files to blocks/blkNNNNN-1.dat

A hard link is the semantics we want: a copy would use twice the disk space, and a move would mean you have to re-download blocks if you switch back to 0.7.

The hard link failing is a soft error-- in that case, you just re-download the blocks.

According to my research, this should work on Windows, unless you're running a FAT32 filesystem.
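A minimal sketch of the upgrade step described above, not the code from this pull request. It assumes C++17 `std::filesystem` as a stand-in for `boost::filesystem`, and the function names `OldBlockName`, `NewBlockName`, and `LinkOldBlockFiles` are hypothetical. The two name formats and the shift of the index by one are taken from the diff excerpts in this thread.

```cpp
#include <cstdio>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Pre-0.8 name, e.g. 1 -> "blk0001.dat" (format string as in the diff).
static std::string OldBlockName(unsigned n) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%04u.dat", n);
    return buf;
}

// 0.8 name, e.g. 0 -> "blk00000.dat" (format string as in the diff).
static std::string NewBlockName(unsigned n) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%05u.dat", n);
    return buf;
}

// Hypothetical helper: hard-link consecutive old block files found in dataDir
// to dataDir/blocks, with the file number shifted down by one.
// Returns the number of links created.
unsigned LinkOldBlockFiles(const fs::path& dataDir) {
    const fs::path blocksDir = dataDir / "blocks";
    std::error_code ec;
    fs::create_directories(blocksDir, ec);

    unsigned linked = 0;
    for (unsigned i = 1; i < 10000; ++i) {
        const fs::path source = dataDir / OldBlockName(i);
        if (!fs::exists(source, ec)) break;  // stop at the first missing file
        const fs::path dest = blocksDir / NewBlockName(i - 1);
        fs::create_hard_link(source, dest, ec);
        if (ec) {
            // Soft error: the blocks would simply be downloaded again.
            std::printf("Error hardlinking %s : %s\n",
                        source.string().c_str(), ec.message().c_str());
        } else {
            std::printf("Hardlinked %s -> %s\n",
                        source.string().c_str(), dest.string().c_str());
            ++linked;
        }
    }
    return linked;
}
```

The sketch uses the `std::error_code` overloads rather than exceptions so that a failed link is reported and skipped, which matches the "soft error" behaviour described above.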

Member

luke-jr commented Dec 13, 2012

Looks good to me. Note, however, that ultraprune currently has problems in some cases with hard linked files. IMO those should be fixed regardless, so not a blocker for this.

Owner

sipa commented Dec 13, 2012

@luke-jr I only know of problems with read-only files. Whether a file is hard-linked shouldn't even be observable by applications (except for the output of stat()).
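To illustrate the stat() remark, here is a small sketch of how a program could check whether two paths name the same underlying file. It assumes C++17 `std::filesystem`, which exposes the relevant stat() information through `equivalent` and `hard_link_count`; the helper names `IsSameFile` and `LinkCount` are hypothetical.

```cpp
#include <cstdint>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// True if both paths resolve to the same file (same device and inode on POSIX).
// Any error, such as a missing file, is treated as "not the same file".
bool IsSameFile(const fs::path& a, const fs::path& b) {
    std::error_code ec;
    const bool same = fs::equivalent(a, b, ec);
    return !ec && same;
}

// Number of directory entries that refer to the file, or 0 on error.
std::uintmax_t LinkCount(const fs::path& p) {
    std::error_code ec;
    const std::uintmax_t n = fs::hard_link_count(p, ec);
    return ec ? 0 : n;
}
```

A hard-linked pair reports the same identity and a link count of at least 2, while a copy reports a different identity and its own link count.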

+ } catch (filesystem::filesystem_error & e) {
+ // Note: hardlink creation failing is not a disaster, it just means
+ // blocks will get re-downloaded from peers.
+ printf("Error hardlinking blk%04u.dat : %s\n", i, e.what());

Diapolo Dec 13, 2012

IMHO this could be an InitWarning() and the message should be a little more user friendly and translatable.


gavinandresen Dec 13, 2012

Contributor

I don't want users to be told that hardlinking failed-- most end-users won't know what a hard-link is, and if it DOES fail I don't expect them to do anything (they'll just fall-back to downloading the whole chain again).

Owner

sipa commented Dec 13, 2012

This is definitely the easiest solution imaginable, assuming it works for Windows users.

Importing instead of reindexing is probably sometimes a better choice (it helps with heavily fragmented block files, for example, and results in the intended smaller files), but I suppose those cases can be handled manually.

Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/662bdaa9c36c0b0da29edd82fa9bebbee9a1ecaa for binaries and test log.

This could happen for one of several reasons:

  1. It changes paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts
  2. It does not build on either Linux i386 or Win32 (via MinGW cross compile)
  3. The test suite fails on either Linux i386 or Win32
  4. The block test-cases failed (look up the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)
src/init.cpp
+ filesystem::path dest = blocksDir / strprintf("blk%05u.dat", i-1);
+ try {
+ filesystem::create_hard_link(source, dest);
+ printf("Hardlinked %s -> %s\n", source.c_str(), dest.c_str());

sipa Dec 16, 2012

Owner

source.string().c_str(), etc. if you want to be compatible with older filesystem boost lib.

Contributor

gavinandresen commented Dec 16, 2012

Thanks @sipa -- updated to use string().c_str() and rebased.

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/f4445f9982a760869c430f3d4b1302f7eb509bd8 for binaries and test log.

Contributor

rebroad commented Dec 23, 2012

If the linking fails, can it copy or move the file instead of downloading from the genesis block?

Member

luke-jr commented Dec 23, 2012

I wonder how difficult it would be to have the code just use the old blk000?.dat files directly when they exist.

Owner

sipa commented Dec 24, 2012

The reason to move to a new naming scheme was:

  • to have them in a separate directory, so they can be moved more easily to a different volume
  • to make pruning easier eventually (as only being able to choose storage with 2 GiB accuracy is a bit crude).

Hardlinking removes that second advantage, but by the time we're actually going to support pruning, I suppose many setups will not be using an upgraded pre-0.8 datadir anyway.

Regarding using the old files: currently, the block index encodes positions in files as filenum+byteoffset. I suppose using some trick like making negative filenums refer to old files, we can encode both, but the reindexing/loading code should be checked carefully... a dirty hack in any case.
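The negative-filenum idea can be sketched as follows. This is only an illustration of the trick described in the comment, not code from the project: the `BlockPos` struct and `BlockFileName` function are hypothetical names, and the mapping of file number -N to the old file blkNNNN.dat is an assumption.

```cpp
#include <cstdio>
#include <string>

// Hypothetical block position: file number plus byte offset within that file.
struct BlockPos {
    int nFile;          // >= 0: new-style file; < 0: pre-0.8 file number -nFile
    unsigned int nPos;  // byte offset inside the file
};

// Map a position to the file it lives in, relative to the data directory.
std::string BlockFileName(const BlockPos& pos) {
    char buf[32];
    if (pos.nFile >= 0) {
        std::snprintf(buf, sizeof(buf), "blocks/blk%05u.dat",
                      static_cast<unsigned>(pos.nFile));
    } else {
        std::snprintf(buf, sizeof(buf), "blk%04u.dat",
                      static_cast<unsigned>(-pos.nFile));
    }
    return buf;
}
```

For example, `BlockFileName(BlockPos{-1, 8})` yields `blk0001.dat`, while `BlockFileName(BlockPos{0, 8})` yields `blocks/blk00000.dat`. Every reader of the index would need to handle the sign, which is the part the comment calls a dirty hack.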

Member

luke-jr commented Dec 24, 2012

I was thinking more like "if blk000?.dat exist, start new files with the next number. when reading, if the blk0000?.dat is missing, check blk000?.dat" so the files would be stored as the same filenum :)
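One possible reading of this proposal, sketched with C++17 `std::filesystem` as a stand-in for `boost::filesystem`. The names `ResolveBlockFile`, `FirstNewFileNumber`, `LegacyName`, and `CurrentName` are hypothetical, and starting a fresh data directory at file number 0 is an assumption.

```cpp
#include <cstdio>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Old-style name, e.g. 2 -> "blk0002.dat".
static std::string LegacyName(unsigned n) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%04u.dat", n);
    return buf;
}

// New-style name, e.g. 2 -> "blk00002.dat".
static std::string CurrentName(unsigned n) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "blk%05u.dat", n);
    return buf;
}

// When reading file number n: prefer the new file, fall back to the old one.
// If neither exists, return the new-style path (where the file would be created).
fs::path ResolveBlockFile(const fs::path& dataDir, unsigned n) {
    std::error_code ec;
    const fs::path current = dataDir / "blocks" / CurrentName(n);
    if (fs::exists(current, ec)) return current;
    const fs::path legacy = dataDir / LegacyName(n);
    if (fs::exists(legacy, ec)) return legacy;
    return current;
}

// When writing: start new files after the highest consecutive old file.
// Old numbering starts at 1; with no old files, start at 0 (assumption).
unsigned FirstNewFileNumber(const fs::path& dataDir) {
    std::error_code ec;
    unsigned n = 1;
    while (n < 10000 && fs::exists(dataDir / LegacyName(n), ec)) ++n;
    return n == 1 ? 0 : n;
}
```

With this scheme an old file and a new file share the same file number, so the block index would not need a second encoding.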

Owner

sipa commented Dec 25, 2012

In case hardlinking fails, I don't think there's any harm in adding the found source files automatically to the list of files to be -loadblock='ed. They will be downloaded anyway otherwise, and loadblock happens in a background thread anyway, so it can only speed things up.
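A rough sketch of that fallback, assuming C++17 `std::filesystem` in place of `boost::filesystem`. The function name `LinkOrQueueForImport` and the `importQueue` vector are hypothetical; how the queued files would actually be handed to the -loadblock machinery is not shown in this thread and is left out here.

```cpp
#include <filesystem>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Try to hard-link source to dest. If that fails, remember source so that it
// can be imported later instead of downloading its blocks again.
// Returns true if the link was created.
bool LinkOrQueueForImport(const fs::path& source, const fs::path& dest,
                          std::vector<fs::path>& importQueue) {
    std::error_code ec;
    fs::create_hard_link(source, dest, ec);
    if (!ec) return true;
    importQueue.push_back(source);  // fallback: import in the background
    return false;
}
```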

Contributor

gavinandresen commented Jan 8, 2013

RE: doing something smart if the hardlink fails:

I'm not planning on testing that, and I don't believe in adding untested code even if it should OBVIOUSLY work.

If somebody wants to implement and test that as a separate pull, great, but it is not on my priority list because it would benefit so few (maybe zero) people.

Contributor

gavinandresen commented Jan 9, 2013

Test plan with testing bounties:
https://github.com/gavinandresen/QA/blob/master/HardLinksUpgrade.md

First bounty claimed.

Contributor

schildbach commented Jan 9, 2013

@gavinandresen Can you explain in the "build binaries yourself" case which branch we need to build? The main "bitcoin" repository does not appear to contain 0.8 related stuff.

Owner

sipa commented Jan 9, 2013

@goonie from the branch this pull request refers to. bitcoin/bitcoin.git master does in fact contain the code that will become 0.8 (it's marked "0.7.99" now).

Contributor

schildbach commented Jan 11, 2013

The hard-linking does not appear to work on my system (Ubuntu 12.10 64bit). In two different cases, my .bitcoin directory grew from 6.3 GB to nearly 12 GB and my df (disk free) was significantly reduced. The files in blocks are of a different size than the blk*.dat files and have different inode numbers.

The whole process took several hours.

Owner

sipa commented Jan 11, 2013

@goonie Very strange. Which commit did you use? (Bitcoin reports this in the version string; see the first debug.log line printed.)

Contributor

schildbach commented Jan 11, 2013

@sipa 2013-01-10 22:20:55 Bitcoin version v0.7.1-297-g429915b-beta (2013-01-06 07:26:43 -0800)

Maybe I should mention that I was using the Ubuntu package (0.7.2-quantal2) before, which is a little bit different from the official build in its dependencies.

Owner

sipa commented Jan 11, 2013

@goonie I'm afraid I confused you. You built from git head, it seems. This page is a pull request: a change requested to be merged in git head, but not yet there. The branch to pull from is in gavin's repository (https://github.com/gavinandresen/bitcoin-git.git), in branch blkfile_upgrade.

It's also included in my 'turbo' branch (https://github.com/sipa/bitcoin.git, branch turbo), together with several other pull requests.

Contributor

schildbach commented Jan 11, 2013

@sipa Ok, this time I'm using v0.7.1-269-gf4445f9-beta. It looks like it's using hard links in the blocks directory. It managed to reindex about 211700 blocks in 80 minutes. Then suddenly the status bar text disappeared. From the icons on the lower right and from debug.log I can tell it's still accepting blocks, but at a relatively low rate of 1 block per second (still better than before).

Contributor

schildbach commented Jan 11, 2013

@sipa Another issue: the "Show QR code" context menu option silently fails. I remember it was working before because I was using QR codes to scan with Bitcoin Wallet.

Contributor

gavinandresen commented Jan 11, 2013

@goonie : context menu breaking wouldn't be a @sipa bug-- can you open a new issue about that?

Diapolo commented Jan 11, 2013

@goonie The GUI is using an estimated block count, which it gets from the connected nodes and the last checkpoint, to be able to compute a "Blocks left" number and to display the progressbar. If you have more blocks than the estimated block count there is nothing to predict anymore and so the progressbar gets hidden.
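The behaviour described here can be sketched roughly as below. This is not the GUI's actual code: `SyncProgress` and `ComputeSyncProgress` are hypothetical names, and combining the peer estimate with the last checkpoint height by taking the larger of the two is an assumption.

```cpp
#include <algorithm>

// Hypothetical summary of what the status bar would display.
struct SyncProgress {
    bool showBar;    // false once there is nothing left to predict
    int blocksLeft;  // the "Blocks left" number
    int percent;     // progress bar fill, 0..100
};

SyncProgress ComputeSyncProgress(int haveBlocks, int peerEstimate,
                                 int lastCheckpointHeight) {
    // Assumption: the estimate is the larger of the two sources.
    const int estimatedTotal = std::max(peerEstimate, lastCheckpointHeight);
    const int have = std::max(haveBlocks, 0);
    if (have >= estimatedTotal) {
        // At or past the estimate: hide the bar.
        return {false, 0, 100};
    }
    // Here estimatedTotal > have >= 0, so the division is safe.
    const int percent =
        static_cast<int>(static_cast<long long>(have) * 100 / estimatedTotal);
    return {true, estimatedTotal - have, percent};
}
```

With 211700 blocks and an estimate of 216000, the sketch reports 4300 blocks left. Once the local count passes the estimate, `showBar` becomes false, which matches the disappearing status bar text reported above.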

Owner

sipa commented Jan 13, 2013

ACK

gavinandresen added a commit that referenced this pull request Jan 14, 2013

Merge pull request #2099 from gavinandresen/blkfile_upgrade
Upgrading to 0.8: re-use blkNNNN.dat files.

gavinandresen merged commit dd46c88 into bitcoin:master Jan 14, 2013

laudney pushed a commit to reddcoin-project/reddcoin that referenced this pull request Mar 19, 2014

Merge pull request #2099 from gavinandresen/blkfile_upgrade
Upgrading to 0.8: re-use blkNNNN.dat files.