Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic header support #140

Merged
merged 3 commits into from
Mar 15, 2017

Conversation

erasmospunk
Copy link
Contributor

Block headers can have a dynamic size that is being indexed on a
new meta file "headers_offsets".

The offsets are 64bit in order to accommodate coins with big
headers that will accumulate GBs of header data after some years.
We don't store the offset of the genesis header as it's always 0,
this means that the offset for height 1 is stored at position 0,
for height 2 at 8 and so on.

This commit adds support for AuxPow coins like Namecoin, Dogecoin
and Equihash coins like Zcash

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.921% when pulling 5cb58074914c5816464c683787dad6a865b50e71 on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.916% when pulling 60459ced19630e9170098ebf6178a16579c88584 on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

Small fix

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.916% when pulling 5027abb3921e3591dbb73158c74d5c609e6baced on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

Small refactor

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.921% when pulling 3ee04d09a6c62c01f315542d1e0f98941f173853 on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

Some clean ups

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.929% when pulling 972e16da1d402bd63df2f26381f60d7ea5f53484 on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

Fixed edge case

@coveralls
Copy link

coveralls commented Feb 24, 2017

Coverage Status

Coverage increased (+13.3%) to 16.908% when pulling 615bf165e0b30b4dfcdcfbe24e537e0e68084c13 on erasmospunk:dynamic-headers into 0122336 on kyuupichan:master.

@kyuupichan
Copy link
Owner

John - I'll be leaving these changes for post 1.0. They are best reviewed if you can break them up into perhaps 3 or 4 parts. Let's just do 1 part at a time so you don't have to keep updating later ones if earlier ones change.

@coveralls
Copy link

coveralls commented Feb 26, 2017

Coverage Status

Coverage increased (+13.2%) to 16.84% when pulling 0ca9c099295c13c216ca54fe6d64b9eb69aa7c8d on erasmospunk:dynamic-headers into b3005fb on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

@kyuupichan just pushed the changes in smaller commits.

Please notify me when you are ready to merge so I can rebase to the latest master if needed.

@erasmospunk
Copy link
Contributor Author

Rebased to the latest master

@coveralls
Copy link

coveralls commented Feb 26, 2017

Coverage Status

Coverage increased (+13.2%) to 16.84% when pulling 2b670702127957db842d2048050284cc47295970 on erasmospunk:dynamic-headers into b3005fb on kyuupichan:master.

@erasmospunk
Copy link
Contributor Author

Removed commit #141 from this pull request

@coveralls
Copy link

coveralls commented Feb 26, 2017

Coverage Status

Coverage increased (+13.2%) to 16.84% when pulling 4530e216cef26da607c35c52dacacf0f7d39dcc1 on erasmospunk:dynamic-headers into b3005fb on kyuupichan:master.

@coveralls
Copy link

coveralls commented Feb 26, 2017

Coverage Status

Coverage increased (+13.2%) to 16.802% when pulling bdda44548e6a0c8968c4bdc58aa7256cac1a0292 on erasmospunk:dynamic-headers into 472c738 on kyuupichan:master.

Copy link
Owner

@kyuupichan kyuupichan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. In future please omit the 0 in [0:x].

@kyuupichan
Copy link
Owner

I'll try and take a look at this stuff this weekend. Regarding your Tx question, not looked in detail but would probably prefer the tx be a separate class like you have it, perhaps derived from the existing one.

@coveralls
Copy link

coveralls commented Mar 9, 2017

Coverage Status

Coverage increased (+2.9%) to 18.649% when pulling f0cbdb69bb257781b108764bf625892dcab71708 on erasmospunk:dynamic-headers into 30bb832 on kyuupichan:master.

@coveralls
Copy link

coveralls commented Mar 9, 2017

Coverage Status

Coverage increased (+2.9%) to 18.654% when pulling cf2de9b0e2edaa1bfa2bd3faa7db975984f996ca on erasmospunk:dynamic-headers into 30bb832 on kyuupichan:master.

@kyuupichan
Copy link
Owner

kyuupichan commented Mar 11, 2017

John, sorry this is just a bit too much of a large change all at once. I really appreciate the effort you've put into this and the tests, but I need it broken up a bit. A few other things:

  • Can we do the API changes first? Then maybe one or more changes, and finally the new coin definitions and tests.
  • I know the definition of block will change with the named tuple, but I'd like to keep calling it a block. So let's just stick with block and blocks rather than b, parsed_blocks, full_block etc.
  • I'd like to avoid the Deserializer construction you've done. I had to stare at it a while (and I learnt something new!). I don't want tx.py objects having a COIN member for now. The old classes won't need one; the methods of the new deserializer classes can be passed "cls" to a coin parameter if necessary when invoked from coins.py. But looking at your changes it seems all they really want is a basic_header_size to be passed.
  • perhaps coin.DESERIALIZER should be lower case and just be the function in tx.py for bitcoin and similar, and be a function in the coin class for aux_pow coins
  • Do we actually really need a read_header API in tx.py? Can we keep it as-is with read_tx and read_block (perhaps to be renamed read_tx_block)? I'd like to keep tx.py to be just about transactions; I think the knowledge of the block format and header size better belongs in the coin class.
  • in db.py I'd like to call fs_update_header_offsets unconditionally, and return doing nothing if STATIC_BLOCK_HEADERS, rather than testing call sites and asserting inside the function.

I realize this might be a bit of work. If you prefer I can have a go at doing the initial API changes, but I'm not sure I'll get everything. Then you can follow up with the bulk of the change. Or you can do the lot. But if you could break it up into at least 3 pieces it would help. One at a time is fine.

I've created a dynamic-headers branch on github. Can you work against that? Let me know how you'd like to proceed.

@erasmospunk
Copy link
Contributor Author

@kyuupichan thanks for the quality code review.

Can we do the API changes first? Then maybe one or more changes, and finally the new coin definitions and tests.

Good point. Regarding the coin definitions, what do you think about a folder that contains a .py file per coin? This should keep clean the coins.py

Regarding the other points I will take a look in the next days.

@kyuupichan
Copy link
Owner

Let's keep it all in coins.py for now. Later something might be more appropriate,indeed. One advantage to having it all in coins.py is selecting the coin class is easy programatically (the lookup_ function); it might be more awkward if in separate files.

@erasmospunk
Copy link
Contributor Author

@kyuupichan one thing that I noticed, when parsing the block it is useful to have the Deserializer::cursor property, especially for parsing a block footer if it exists.

@erasmospunk
Copy link
Contributor Author

Although technically we don't need the footer...

@erasmospunk erasmospunk changed the base branch from master to dynamic-headers March 14, 2017 01:26
@coveralls
Copy link

coveralls commented Mar 14, 2017

Coverage Status

Coverage increased (+2.9%) to 18.65% when pulling 77669871e60f8d91ba09c599a385f0730393eee4 on erasmospunk:dynamic-headers into 9c6d2f5 on kyuupichan:dynamic-headers.

@erasmospunk
Copy link
Contributor Author

Refactored code but haven't tried it yet, please review but don't merge.

One thing that I would like to refactor is to expose the Deserializer private methods like _read_varint and _read_nbytes that are essential to parse the dynamic headers.

Another difference is that we don't read the block footers anymore.

lib/coins.py Outdated
return deserializer(block[cls.header_len(height):]).read_block()
if height > 0:
return cls.block_full(block, height)
else:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like an infinite loop. I'll try and fix it here.

@@ -566,14 +567,14 @@ def backup_blocks(self, blocks):
coin = self.coin
for block in blocks:
# Check and update self.tip
header = coin.block_header(block, self.height)
header, txs = coin.block_full(block, self.height)
header_hash = coin.header_hash(header)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I prefer to keep the namedtuple as a single object and refer to its members, rather than assuming it has length 2. I'll update that too, here.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm merging your first commit of the 4 and following up with a couple of tweaks as mentioned above. If you could rebase that would be great. But before doing so let me make a couple of comments on the 2nd commit too.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and that code was good enough to survive a forced testnet 3 block reorg via RPC

server/db.py Outdated
# We don't store the offset of the genesis header as it's always 0
# so this means that the offset for height 1 is stored at
# position 0, for height 2 at 8 and so on.
if height > 0:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather waste 8 bytes on disk and not have the extra comment and if statement. Any reason not to store height 0 as 0 too? How is it not getting written? I might be missing something.

And can you add yourself to AUTHORS? This 2nd commit looks good otherwise.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, block 0 offset is a special case to "bootstrap" the first offset when the headers_offsets_file is initially empty, otherwise dynamic_header_offset will not work.

Check how the fs_update_header_offsets calculates the offset of block N based on the header size of block N-1.

One pythonic "fix" could be:

try:
    pos = (height - 1) * 8
    offset, = unpack('<Q', self.headers_offsets_file.read(pos, 8))
    return offset
except struct.error as e:
    if height == 0:
        return 0
    else:
        raise Exception('Header offset not found: {}'.format(height))

Another alternative is to check and write the block 0 offset when creating the headers_offsets_file.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, do you mind writing the initial 0? There is a similar issue with tx_count, and I don't like my current solution, it caused too many if statements.

lib/coins.py Outdated
header_end = cls.BASIC_HEADER_SIZE
block.cursor = start
return block._read_nbytes(header_end)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think the bulk of the block_header method here is best put in an aux-pow specific deserializer? I think that's what you did before and I think I prefer it now it's clearer what's happening. Sorry about that. Similarly for Zcash below.

So from here just call AuxPowDeserializer.block_header(cls.BASIC_HEADER_SIZE, cls.VERSION_AUX_POW) or something like that. Those 2 variables belong in the coin class as they are now. With that I think I'm OK with everything once you rebase. I'll merge it. Thanks.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thing, please put in a commit message the github text "Closes #xxx" for the 2 outstanding issues related to this aux pow support. I assume it does resolve them.

@coveralls
Copy link

Coverage Status

Coverage increased (+2.9%) to 18.753% when pulling e33466876024f796e89a84acbea8391b43f582f5 on erasmospunk:dynamic-headers into 0fdd23a on kyuupichan:dynamic-headers.

1 similar comment
@coveralls
Copy link

coveralls commented Mar 14, 2017

Coverage Status

Coverage increased (+2.9%) to 18.753% when pulling e33466876024f796e89a84acbea8391b43f582f5 on erasmospunk:dynamic-headers into 0fdd23a on kyuupichan:dynamic-headers.

Block headers can have a dynamic size that is being indexed on a
new meta file "headers_offsets".

The offsets are 64 bits in order to accommodate coins with big
headers that will accumulate GBs of header data after some years.

Closes kyuupichan#128
@coveralls
Copy link

coveralls commented Mar 15, 2017

Coverage Status

Coverage increased (+2.9%) to 18.753% when pulling cb9ab7e on erasmospunk:dynamic-headers into 0fdd23a on kyuupichan:dynamic-headers.

@erasmospunk
Copy link
Contributor Author

Updated with the changes you asked. Also I made some tests with Dogecoin, Zcash and it seems to work without an issue like before.

@kyuupichan
Copy link
Owner

Thanks John. I'll merge this as it seems you're happy with non-BTC stuff. I'll run this on my server for a while, and if it's good I'll merge it into master.

@kyuupichan kyuupichan merged commit cd990c4 into kyuupichan:dynamic-headers Mar 15, 2017
@erasmospunk erasmospunk deleted the dynamic-headers branch March 21, 2018 17:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants