Bitcoin: (bug) loadblock/bootstrap.dat will not read file larger than 2.0GB #1951

Closed
qubez opened this issue Oct 23, 2012 · 13 comments
@qubez

qubez commented Oct 23, 2012

Issue: New bitcoin install imports only 2.0GB of blockchain bootstrap.dat (height=189205) before continuing startup.

Platform: Windows 7 SP1 x64
Client: Bitcoin-0.7.1-Win32

Steps to replicate:
Obtain bootstrap.dat torrent (SHA256 a3f258e7af...) (block height 193000, 2.32 GB)

Command line used:
bitcoind.exe -datadir=C:\datadir -loadblock=C:\bootstrap.dat -connect=127.0.0.1 -detachdb -printtoconsole

Wait 5000 seconds or so. Only blocks up to 189205 are processed before bitcoin continues normal operation (giving the expected "no RPC password" error if no bitcoin.conf file is present).

https://bitcointalk.org/index.php?topic=117982.msg1292042#msg1292042

@Diapolo

Diapolo commented Oct 23, 2012

Seems logical, as before the blk000x.dat files had a hard-coded limit of < 2GiB on Windows. I'm sure @jgarzik or @sipa can clarify this. Are you using NTFS or FAT32 as filesystem?

@qubez
Author

qubez commented Oct 24, 2012

NTFS and EXT4. Linux PPA build 0.7.0 exhibits the same behaviour.

@laanwj
Member

laanwj commented Oct 24, 2012

Don't blame the file system, modern filesystems can handle huge files (how else would you store your BluRay images 😏)
It's probably some trivial issue using int or unsigned int as file pointer with fseek instead of off_t.

Looking at CDiskBlockPos in main.h:

...
unsigned int nPos
....

Also line 2502 in main.cpp https://github.com/bitcoin/bitcoin/blob/master/src/main.cpp#L2502.
And in util.h:

void AllocateFileRange(FILE *file, unsigned int offset, unsigned int length) 
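
To make the suspected failure mode concrete, here is a minimal sketch of what happens to offsets past 2 GiB when they pass through 32-bit types. It is an illustration only, not code from the Bitcoin source; the helper names `FitsInSigned32`, `StoreAsUint32` and `ToSeekOffset32` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Largest offset a signed 32-bit integer can hold: 2 GiB - 1.
// This is the range of `long` wherever `long` is 32 bits wide.
constexpr std::int64_t kMaxSigned32 = std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: true if `offset` can be handed to a seek call whose
// offset parameter is a signed 32-bit integer.
constexpr bool FitsInSigned32(std::uint64_t offset) {
    return offset <= static_cast<std::uint64_t>(kMaxSigned32);
}

// Hypothetical helper: the value an `unsigned int nPos`-style field would
// hold for a 64-bit position (it wraps modulo 2^32).
constexpr std::uint32_t StoreAsUint32(std::uint64_t offset) {
    return static_cast<std::uint32_t>(offset);
}

// Hypothetical helper: narrow to a 32-bit seek offset, refusing to truncate.
inline std::int32_t ToSeekOffset32(std::uint64_t offset) {
    assert(FitsInSigned32(offset));
    return static_cast<std::int32_t>(offset);
}
```

A 2.32 GB file has positions above 2^31 - 1, which is consistent with the import stopping at about 2.0GB.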

@sipa
Member

sipa commented Oct 24, 2012

I suppose a CReadBuffer that wraps around CAutoFile or other reader classes, and has a method for skipping input until a fixed string is found, would be a very neat solution that doesn't require any seeking at all.

Also, fixing the byte offsets to use off_t instead of ints in the code would certainly be an improvement, but at least in the current flow, AllocateFileRange should never be called with offset+length > 128 MiB.
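
A rough sketch of the "skip input until a fixed string is found" idea, reading strictly forward with no seeking. This is not the CReadBuffer design itself: `SkipUntil` is a hypothetical free function over a plain `FILE*`, and it reads one byte at a time for clarity rather than speed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

// Hypothetical helper: consume bytes from `in` until the last
// marker.size() bytes read equal `marker`. Returns true on a match and
// false on end of input. `pos` is advanced by the number of bytes consumed,
// so the caller keeps a 64-bit position without ever calling fseek/ftell.
bool SkipUntil(std::FILE* in, const std::vector<unsigned char>& marker,
               std::uint64_t& pos) {
    assert(in != nullptr && !marker.empty());
    std::vector<unsigned char> window;  // sliding window of recent bytes
    int c;
    while ((c = std::fgetc(in)) != EOF) {
        ++pos;
        if (window.size() == marker.size()) {
            window.erase(window.begin());  // drop the oldest byte
        }
        window.push_back(static_cast<unsigned char>(c));
        if (window == marker) {
            return true;
        }
    }
    return false;
}
```

A caller would pass the four network magic bytes as `marker` and then read the length-prefixed block that follows; a real implementation would likely buffer reads instead of calling `fgetc` per byte.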

@Diapolo

Diapolo commented Oct 24, 2012

Is off_t a 64 bit unsigned integer? That would allow quite big files ^^.

We need to fix this even with ultraprune, because of the bootstrap.dat file mentioned.

@sipa
Member

sipa commented Oct 24, 2012

off_t is whatever the system supports for offsets, but it's not entirely standardized (there's also an off64_t sometimes, with a corresponding lseek64 function, defeating the purpose of the original off_t somewhat).

What I'm saying is that off_t would certainly be an improvement over what we have now, but it would be even better not to need to seek at all, which is certainly possible in LoadExternalBlockFile.

@jgarzik
Contributor

jgarzik commented Oct 24, 2012

As noted in IRC, the specific problem is that LoadExternalBlockFile() calls fseek(), whose file offset is limited to signed 32-bit (long) on 32-bit platforms. This impacts Windows, Linux and presumably OSX as well.

  1. LoadExternalBlockFile() should be updated to avoid seeking. Most likely fread() will continue to work even beyond the 4GB boundary, if we read linearly and accumulate the file position ourselves.

  2. Most of the code uses 32 bits for file position, which is highly disappointing. At a minimum, we should make sure that external serialized storage in databases like ultraprune records 64-bit file positions.
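
Point 1 can be sketched roughly as follows: advance through the file with fread() alone and keep the position in a 64-bit counter of our own. This is only an illustration of the idea, not the change that was eventually made; `SkipByReading` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: move forward by `count` bytes by reading and
// discarding them, instead of calling fseek(). `pos` is our own 64-bit
// running position and is advanced by the bytes actually read.
// Returns false if end of file or a read error is hit first.
bool SkipByReading(std::FILE* in, std::uint64_t count, std::uint64_t& pos) {
    assert(in != nullptr);
    unsigned char scratch[4096];
    while (count > 0) {
        const std::size_t want = static_cast<std::size_t>(
            std::min<std::uint64_t>(count, sizeof scratch));
        const std::size_t got = std::fread(scratch, 1, want, in);
        pos += got;
        count -= got;
        if (got < want) {
            return false;  // short read: EOF or error
        }
    }
    return true;
}
```

The trade-off is that skipped data is actually read from disk, which costs nothing extra when the file is being consumed front to back anyway.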

@luke-jr
Member

luke-jr commented Oct 24, 2012

I wonder if SEEK_CUR would work?

@jgarzik
Contributor

jgarzik commented Oct 24, 2012

SEEK_CUR would probably work, but why chance it? A simple linear read works fine too.

@qubez
Author

qubez commented Oct 24, 2012

A different solution would be to have blockchain-generating scripts create transaction block files less than 2GB in size, perhaps named bootstrap.001, bootstrap.002, etc., and have Bitcoin look for and import these sequentially instead. One must also be concerned about another file, blkindex.dat: it is 1.1GB and must be replaced in all clients with the ultraprune leveldb and/or reviewed for big file support before it approaches 2GB.

@laanwj
Member

laanwj commented Oct 25, 2012

off_t is 64 bit if -D_FILE_OFFSET_BITS=64 is defined. I've just verified this with mingw and linux (someone needs to verify on OSX).

Another problem is that we use fseek, which takes a long for the offset, and the size of long depends on the architecture. We could instead use fseeko, which takes an off_t (and ftello, which returns one).

@qubez: blkindex.dat is not affected. It is a Berkeley DB database, which has no problems with large files.
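
A minimal sketch of the fseeko/ftello approach, assuming a POSIX-like toolchain where defining `_FILE_OFFSET_BITS=64` widens `off_t` (normally passed as a compiler flag; it is defined in the source here only to keep the example self-contained). The wrapper names `SeekTo` and `Tell` are hypothetical.

```cpp
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64  // must come before any system header
#endif
#include <cassert>
#include <stdio.h>
#include <sys/types.h>

// Fails at compile time if the platform did not give us a 64-bit off_t.
static_assert(sizeof(off_t) == 8, "expected a 64-bit off_t");

// Hypothetical wrapper: absolute seek using an off_t offset instead of long.
bool SeekTo(FILE* f, off_t offset) {
    assert(f != nullptr);
    return fseeko(f, offset, SEEK_SET) == 0;
}

// Hypothetical wrapper: current position as an off_t (-1 on error).
off_t Tell(FILE* f) {
    assert(f != nullptr);
    return ftello(f);
}
```

With these, a position such as 3 GiB round-trips through seek and tell without truncation, which a `long`-based fseek cannot guarantee where `long` is 32 bits.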

@Diapolo

Diapolo commented Feb 11, 2013

@jgarzik
Contributor

jgarzik commented Feb 11, 2013

Yes, it is solved.

@jgarzik jgarzik closed this as completed Feb 11, 2013
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021