
Release 2.05.0.2.1 (hotfix: check chain work when processing reorgs) #3152

Merged — 18 commits merged into master on Jun 3, 2022

Conversation

@jcnelson (Member) commented May 26, 2022

Okay, this is now ready for review.

This is a hotfix to the SPV client that addresses the following bugs:

  • The SPV client now tracks the best chain work seen for each Bitcoin difficulty adjustment interval. Any attempt to store headers that would decrease the total chain work returns an error, which causes the node to retry the header download indefinitely until the downloaded headers have at least the best-seen work.

  • The SPV client will verify that newly-obtained headers after the initial header sync have timestamps no more than 2 hours ahead of the current epoch time.

  • The SPV client will verify that each Bitcoin header has a timestamp that is strictly greater than the median of the past 11 Bitcoin blocks.

  • The SPV client will reject a block only if its hash value is strictly higher than the target for the difficulty interval, instead of rejecting blocks whose value is equal to or higher than the target (a hash equal to the target is valid).
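The two timestamp rules above can be sketched as follows (hypothetical helper names for illustration only; the real checks live in the SPV client, not in these functions):

```rust
// Sketch of the two header timestamp rules (hypothetical helpers; not
// the actual spv.rs API). All times are Unix epoch seconds.

const MAX_FUTURE_DRIFT_SECS: u64 = 2 * 60 * 60; // 2 hours

/// Median of the past (up to) 11 block timestamps.
fn median_time_past(past: &[u64]) -> u64 {
    let mut ts = past.to_vec();
    ts.sort_unstable();
    ts[ts.len() / 2]
}

/// A header timestamp must not be more than 2 hours in the future, and
/// must be strictly greater than the median of the past 11 blocks.
fn check_header_timestamp(ts: u64, now: u64, past: &[u64]) -> bool {
    ts <= now + MAX_FUTURE_DRIFT_SECS && ts > median_time_past(past)
}

fn main() {
    let past: Vec<u64> = (1000..1011).collect(); // 11 timestamps, median 1005
    assert!(check_header_timestamp(1006, 1006, &past));
    assert!(!check_header_timestamp(1005, 1006, &past)); // not > median-time-past
    assert!(!check_header_timestamp(1006 + 7201, 1006, &past)); // too far ahead
}
```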

By far the most work in this PR concerns point #1. To build confidence in the implementation, I added a unit test that is disabled by default; if set up with the proper environment variables, it will download the mainnet Bitcoin block headers and verify that the calculated chain work in each interval is equal to values I manually extracted from bitcoind. In addition, I took the PoC bad block headers from the Immunefi bug report and created two unit tests whereby:

  • A new chain with the bad blocks is discovered and compared to the main chain with the good blocks (this new chain is verified to be ignored, because it has lower total chain work), and

  • A new chain with the good blocks is discovered and compared to the main chain with the bad blocks (this new chain is verified to be accepted, because it has higher total chain work).
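For reference, the per-block quantity summed into each interval's chain work is the standard Bitcoin value work = 2^256 / (target + 1). A scaled-down u128 sketch of that formula (illustrative only; the real code uses Uint256):

```rust
/// Per-block work from a (scaled-down, 128-bit) target, using the
/// standard Bitcoin identity
///   floor(2^128 / (t + 1)) = floor((2^128 - 1 - t) / (t + 1)) + 1,
/// which avoids needing 2^128 itself. Valid for 0 < target < u128::MAX.
fn block_work(target: u128) -> u128 {
    (u128::MAX - target) / (target + 1) + 1
}

fn main() {
    // Halving the target roughly doubles the work...
    assert!(block_work(1u128 << 99) > 2 * block_work(1u128 << 100) - 2);
    // ...and a power-of-two divisor gives an exact answer: 2^128 / 2^127 = 2.
    assert_eq!(block_work((1u128 << 127) - 1), 2);
}
```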

There are a couple of modifications to BitcoinIndexer::find_bitcoin_reorg() that necessitated changes to Burnchain::sync_with_indexer() which are difficult to exercise in both unit and integration tests. I am currently in the process of filling out this end-to-end test checklist:

  • A node with this patch is able to resume from existing mainnet chainstate and reach the chain tip

  • A node with this patch is able to boot from genesis on mainnet and reach the chain tip

  • A node with this patch is able to resume from existing testnet chainstate and reach the chain tip

  • A node with this patch is able to boot from genesis on testnet and reach the chain tip

Once these end-to-end tests pass, I think we can make a hotfix release. Once the first end-to-end test passes (resuming from existing mainnet chainstate), I think we can cut a release candidate.


Reminder to anyone reading this: This project participates in Immunefi to pay bounties for bugs such as this one. This bug was reported via Immunefi and is confirmed to exist, as this PR proves.

…ndexer's find_bitcoin_reorg() method to check the original and reorg chain's total work in order to decide whether to move forward with reorg processing.
…BUT, right now the work calculation for a difficulty adjustment interval is getting the wrong answers and I don't know why.
…et, which is what gets used in the difficulty comparison
@jcnelson jcnelson requested a review from kantai May 26, 2022 04:13
@codecov (bot) commented May 26, 2022

Codecov Report

Merging #3152 (762e007) into master (4641001) will decrease coverage by 1.33%.
The diff coverage is 19.03%.

@@            Coverage Diff             @@
##           master    #3152      +/-   ##
==========================================
- Coverage   83.73%   82.40%   -1.33%     
==========================================
  Files         260      260              
  Lines      201055   203317    +2262     
==========================================
- Hits       168355   167545     -810     
- Misses      32700    35772    +3072     
Impacted Files Coverage Δ
src/burnchains/bitcoin/mod.rs 38.02% <0.00%> (-1.11%) ⬇️
src/burnchains/bitcoin/indexer.rs 33.28% <11.48%> (-50.93%) ⬇️
src/burnchains/bitcoin/spv.rs 73.91% <55.68%> (-3.23%) ⬇️
...-common/src/deps_common/bitcoin/blockdata/block.rs 86.50% <85.71%> (+9.52%) ⬆️
src/burnchains/burnchain.rs 89.87% <87.50%> (-0.26%) ⬇️
stacks-common/src/util/uint.rs 90.26% <93.75%> (+2.44%) ⬆️
src/chainstate/burn/db/sortdb.rs 95.29% <100.00%> (-0.04%) ⬇️
src/net/atlas/download.rs 44.80% <0.00%> (-37.58%) ⬇️
clarity/src/vm/events.rs 52.26% <0.00%> (-17.09%) ⬇️
src/net/atlas/db.rs 82.45% <0.00%> (-9.36%) ⬇️
... and 46 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 26bfd5f...762e007.

@jcnelson jcnelson marked this pull request as ready for review May 27, 2022 01:47
Comment on lines 150 to 153
let bytes = self.0[i].to_le_bytes();
for j in 0..bytes.len() {
ret[$n_words * 8 - 1 - (i * 8 + j)] = bytes[j];
}
@kantai (Member) commented May 27, 2022:

I think using to_be_bytes() makes this a little easier to follow

Suggested change
let bytes = self.0[i].to_le_bytes();
for j in 0..bytes.len() {
ret[$n_words * 8 - 1 - (i * 8 + j)] = bytes[j];
}
let word_end = $n_words * 8 - (i * 8);
let word_start = word_end - 8;
ret[word_start..word_end]
.copy_from_slice(&self.0[i].to_be_bytes());

@jcnelson (Member Author) replied:

Will do
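As a sanity check, the two formulations can be compared standalone (free functions mirroring the Uint256 word layout, least-significant word first; the names here are illustrative, not the actual macro code):

```rust
// Original formulation: walk each word's little-endian bytes and place
// them in reverse order, yielding a big-endian 32-byte buffer.
fn to_be_bytes_loop(words: &[u64; 4]) -> [u8; 32] {
    let n_words = 4;
    let mut ret = [0u8; 32];
    for i in 0..n_words {
        let bytes = words[i].to_le_bytes();
        for j in 0..bytes.len() {
            ret[n_words * 8 - 1 - (i * 8 + j)] = bytes[j];
        }
    }
    ret
}

// Suggested formulation: copy each word's big-endian bytes directly
// into its slot, most-significant word first.
fn to_be_bytes_slice(words: &[u64; 4]) -> [u8; 32] {
    let n_words = 4;
    let mut ret = [0u8; 32];
    for i in 0..n_words {
        let word_end = n_words * 8 - (i * 8);
        let word_start = word_end - 8;
        ret[word_start..word_end].copy_from_slice(&words[i].to_be_bytes());
    }
    ret
}

fn main() {
    let words = [0x0123456789abcdefu64, 0xfedcba9876543210, 0, 0xDEADBEEF];
    assert_eq!(to_be_bytes_loop(&words), to_be_bytes_slice(&words));
    // The most-significant word (index 3) lands at the front of the buffer.
    assert_eq!(&to_be_bytes_loop(&words)[0..8], &0xDEADBEEFu64.to_be_bytes()[..]);
}
```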

@@ -671,4 +734,21 @@ mod tests {
Uint256([0, 0xDEADBEEFDEADBEEF, 0xDEADBEEFDEADBEEF, 0])
);
}

#[test]
pub fn hex_codec() {
@kantai (Member) commented:

The changes in uint need one or two more complex vectors -- this test only covers handling a single word of content.

@jcnelson (Member Author) replied:

Will do

// * needs the last difficulty interval of headers (note that the current
// interval is `start_block / BLOCK_DIFFICULTY_CHUNK_SIZE - 1`).
// * needs the last interval's chain work calculation
let interval_start_block = start_block / BLOCK_DIFFICULTY_CHUNK_SIZE - 2;
@kantai (Member) commented May 27, 2022:

What is interval_start_block? If BLOCK_DIFFICULTY_CHUNK_SIZE is 500 and start_block is 502, this is 1 - 2 = -1. Should that panic or underflow?

@jcnelson (Member Author) replied:

This should be a saturating subtraction.
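A minimal sketch of that fix (assuming u64 block heights; BLOCK_DIFFICULTY_CHUNK_SIZE is 2016 on Bitcoin mainnet):

```rust
// Sketch: compute the start interval without underflowing near genesis.
const BLOCK_DIFFICULTY_CHUNK_SIZE: u64 = 2016;

fn interval_start_block(start_block: u64) -> u64 {
    // `saturating_sub` clamps at 0 instead of panicking/wrapping on underflow.
    (start_block / BLOCK_DIFFICULTY_CHUNK_SIZE).saturating_sub(2)
}

fn main() {
    // Near genesis, the naive `x - 2` would underflow a u64.
    assert_eq!(interval_start_block(502), 0);
    // Deep in the chain it behaves like plain subtraction.
    assert_eq!(interval_start_block(10 * BLOCK_DIFFICULTY_CHUNK_SIZE), 8);
}
```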

let reorg_total_work = reorg_spv_client.update_chain_work()?;
let orig_total_work = orig_spv_client.get_chain_work()?;

debug!("Bitcoin headers history is consistent up to {}. Orig chainwork: {}, reorg chainwork: {}", new_tip, orig_total_work, reorg_total_work);
@kantai (Member) commented:

Looks like a good case for k-v syntax

@jcnelson (Member Author) replied:

Will fix

Comment on lines +713 to +716
info!(
"New canonical Bitcoin chain found! New tip is {}",
&hdr_reorg.header.bitcoin_hash()
);
@kantai (Member) commented:

Does this log line appear on every new bitcoin block, or only on reorgs?

@jcnelson (Member Author) replied:

Only on reorgs. This can only get hit if the newly-discovered headers (the "reorg" headers) have more total work than the headers currently in the headers DB (the "orig" headers).
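The decision rule reduces to a strict comparison of cumulative work (a toy sketch with u128 standing in for the Uint256 chain-work values; the function name is illustrative):

```rust
// Toy sketch of the reorg decision: the newly-discovered chain only
// replaces the original chain if it has strictly more cumulative work.
fn accept_reorg(orig_total_work: u128, reorg_total_work: u128) -> bool {
    reorg_total_work > orig_total_work
}

fn main() {
    assert!(!accept_reorg(100, 100)); // equal work: keep the original chain
    assert!(accept_reorg(100, 101)); // strictly more work: the reorg wins
}
```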

@kantai (Member) left a review comment:

This looks generally good to me -- I had one question about some of the potentially underflowing math in bitcoin::indexer which I think needs to be addressed.

On top of that, I think this needs to update the CHANGELOG.md with a new release section.

@gregorycoppola (Contributor) left a review comment.

@jcnelson (Member Author) commented:

@gregorycoppola I'm not seeing what you're seeing?

…over to the .reorg DB, and use K/V logging. Also, add multi-word test vectors to hex codec for uint256
@gregorycoppola (Contributor) commented:

@jcnelson

I think the linking to the Codecov system is kind of buggy/confusing, because when I click the link it doesn't bring me to the line I started with.

I'm guessing you mean you don't know which lines I'm referring to (but maybe you mean you see the lines but you think they are actually covered).

I think the D3 in the link refers to the file src/burnchains/bitcoin/spv.rs, because it's the third file in the diff list.

I'm attaching some screenshots. [three screenshots omitted]

@jcnelson (Member Author) commented:

@gregorycoppola Oh, I see why. The new unit tests require you to supply a mainnet headers DB, whose path is indicated by the environment variable BLOCKSTACK_SPV_HEADERS_DB. I'm not sure how to go about adding this to the CI test suite, but in the interest of getting this hotfix released, I'm inclined to address it in a later PR. You can manually run these tests yourself; I have done so many times. @kantai thoughts?

@gregorycoppola (Contributor) commented:

@jcnelson I'm not trying to create busy work in the middle of a hotfix... just checking what was going on.

if you think it's tested, that's fine.

@jcnelson (Member Author) commented:

> @jcnelson I'm not trying to create busy work in the middle of a hotfix... just checking what was going on.

Ah, sorry, I didn't realize how dismissive I sounded in that comment! I'm glad you brought up the code coverage -- it's important that we do get this covered in CI. I was only trying to get the reviewers' permission to do so in a separate PR :)

@gregorycoppola (Contributor) left a review comment:

I'm not totally getting the higher-level picture of what exactly the bug was, why it was a bug, and why this fix is needed.

But, it seems to me like the code is implementing the change described in the description... so I'll leave it to you guys to verify that the change makes sense.

@jcnelson jcnelson changed the title hotfix: check chain work when processing reorgs Release 2.05.0.2.1 (hotfix: check chain work when processing reorgs) Jun 3, 2022
@jcnelson (Member Author) commented Jun 3, 2022

Okay everyone, both the Foundation and Hiro have confirmed that this node is able to both boot from existing chainstate and spawn from genesis on both mainnet and testnet. If that's sufficient in your eyes for cutting a release, please approve this.

@kantai (Member) left a review comment:

LGTM!

@jcnelson jcnelson merged commit 38aa968 into master Jun 3, 2022