Fix/1805 #2707

Merged
merged 135 commits into from
Nov 14, 2022

Conversation

jcnelson
Member

@jcnelson jcnelson commented Jun 14, 2021

At long last, this PR fixes #1805.

The Problem

There are currently two related design flaws in the way the Stacks blockchain deals with PoX anchor blocks:

  • If it is ever the case in which a PoX anchor block is missing, and yet somehow manages to achieve 80% or more confirmations during the prepare phase, then the subsequent arrival of that anchor block will cause a deep chain reorg. It doesn't matter how many future blocks get mined -- if the anchor block is later revealed, it will invalidate all of the blocks that did not build on it. While mining and confirming an anchor block is very costly, it's not only possible, but profitable: anyone who manages to do this could hold the blockchain for ransom by threatening to disclose the anchor block and invalidate all blocks after it unless they were paid not to (i.e. in perpetuity).

  • If it is ever the case that not enough STX get locked for PoX to begin in reward cycle R, then a node that processes Stacks blocks first without the anchor block in R and then with the anchor block in R will crash because it will attempt to calculate the same sortition twice. This is because the same block-commits would be processed in both cases -- they'd both be PoB commits.

This PR fixes both problems by making the history of anchor blocks itself forkable, and by implementing Nakamoto consensus on the anchor block history forks so that there will always be a canonical anchor block history. In doing so, the Stacks blockchain now has three levels of forks: the Bitcoin chain, the history of PoX anchor blocks, and the history of Stacks blocks. The canonical Stacks fork is the longest history of Stacks blocks that passes through the canonical history of anchor blocks which resides on the canonical Bitcoin chain.

Background: Sortition Histories

Recall that each Bitcoin block can contain block-commits that are valid only if certain anchor blocks are known to the node, and invalid if other anchor blocks are known. Specifically, a block-commit can be a valid PoX block-commit only if the current reward cycle has an anchor block, and that anchor block is known to the node. Otherwise, if the block-commit does not descend from the anchor block, or there is no anchor block for this reward cycle, then the block-commit can only be valid if it's a PoB block-commit.

What this means is that there is a set of sortition histories on the Bitcoin chainstate that will each yield a unique history of block-commits (which in turn represent a unique set of possible Stacks forks). This set has O(2**n) members, where n is the number of reward cycles that have anchor blocks. This is because each time a new reward cycle is processed with an anchor block, there will be a sortition history that descends from it in which the anchor block is known to the node, and a sortition history in which it is not known.

Which sortition history is the "true" sortition history, and how do we determine this? This is what this PR addresses.

Solution: Weight Sortition Histories by Miner Affirmations

Can we deduce whether or not an anchor block should exist and be known to the network, using only Bitcoin chainstate? A likely anchor block's block-commit will have at least 80 confirmations in the prepare phase -- at least F*w (i.e. 80) Bitcoin blocks will contain at least one block-commit that has the likely anchor block-commit as an ancestor.

Of course, there are competing block-commits in each Bitcoin block; only one will be chosen as the Stacks block. But, recall that in the prepare phase of a reward cycle, all miners must burn BTC. Because miners are sending BTC to the burn address, you can compare the economic worth of all block-commits within a prepare-phase block. Moreover, you can calculate how much BTC went into confirming a likely anchor block's block-commit. In doing so, we can introduce an extra criterion for selecting the anchor block in a reward cycle:

The PoX anchor block for reward cycle R is a Stacks block that has not yet been chosen to be an anchor block, and is the highest block outside R's prepare phase that has at least F*w confirmations and is confirmed by the most BTC burnt.

This is slightly different than the definition in SIP-007. We're only looking at block-commits now. If there are two or more reward-phase block-commits that got F*w confirmations, then we select the block-commit that got the most BTC. If this block-commit doesn't actually correspond to a Stacks block, then there is no anchor block for the reward cycle. Also, if this block-commit has been an anchor block before in some prior reward cycle, then there is no anchor block for this reward cycle. If Stacks miners are honest, and no Stacks miner has more than 80% of the mining power, then neither of these two cases arise -- Stacks miners will build Stacks blocks on top of blocks they know about, and their corresponding block-commits in the prepare-phase will confirm the block-commit for an anchor block the miners believe exists.
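
The selection rule above can be sketched as follows. This is a minimal illustration, not the code in this PR: the `Candidate` struct, its field names, and `pick_anchor_commit` are hypothetical, and the tie-break on equal burn (the commit higher in the chain wins) follows the answer given in the review discussion. It does not check whether the chosen block-commit corresponds to a real Stacks block, since that cannot be decided from Bitcoin chainstate alone.

```rust
/// A candidate block-commit seen outside reward cycle R's prepare phase.
/// All names here are hypothetical, chosen for illustration only.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    /// Height of the Bitcoin block that contains this block-commit.
    pub block_height: u64,
    /// Number of prepare-phase Bitcoin blocks holding a descendant block-commit.
    pub confirmations: u64,
    /// Total BTC (in satoshis) burnt by the block-commits that confirm it.
    pub confirming_burn: u64,
    /// True if this commit was already chosen as an anchor block earlier.
    pub previously_anchor: bool,
}

/// Pick the likely anchor block-commit, or None if the cycle has none.
/// `threshold` is F*w (80 in the text above).
pub fn pick_anchor_commit(candidates: &[Candidate], threshold: u64) -> Option<&Candidate> {
    let best = candidates
        .iter()
        .filter(|c| c.confirmations >= threshold)
        // Most BTC burnt wins; on equal burn, the higher block wins.
        .max_by_key(|c| (c.confirming_burn, c.block_height))?;
    // A commit that already served as an anchor block cannot be reused,
    // so in that case the reward cycle has no anchor block at all.
    if best.previously_anchor {
        None
    } else {
        Some(best)
    }
}
```

Note that the winner is chosen first and only then checked for reuse, which mirrors the wording above: a reused winner means no anchor block, rather than falling through to the runner-up.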

The key insight into understanding the solution to #1805 is to see that the act of choosing an anchor block is also the act of doing the following two things:

  • Picking a likely anchor block-commit is the act of affirming that the anchor block is known to the network. A bootstrapping node does not know which Stacks blocks actually exist, since it needs to go and actually download them. But, it can examine only the Bitcoin chainstate and deduce the likely anchor block for each reward cycle. If a reward cycle has a likely anchor block-commit, then we say that the set of miners who mined that prepare-phase have affirmed to this node and all future bootstrapping nodes that they believed that this anchor block exists. I say "affirmed" because it's a weaker guarantee than "confirmed" -- the anchor block can still get lost after the miners make their affirmations.

  • Picking a likely anchor block-commit is the act of affirming all of the previous affirmations that this anchor block represents. An anchor block is a descendant of a history of prior anchor blocks, so miners affirming that it exists by sending block-commits that confirm its block-commit is also the act of miners affirming that all of the ancestor anchor blocks it confirms also exist. For example, if there are 4 reward cycles, and cycles 1, 2, and 3 have anchor blocks, then the act of miners choosing an anchor block in reward cycle 4's prepare phase that descends from the anchor block in reward cycle 3 is also the act of affirming that the anchor block for reward cycle 3 exists. If the anchor block for reward cycle 3 descends from the anchor block of reward cycle 1, but not from the anchor block in reward cycle 2, then the miners have also affirmed that the anchor block for reward cycle 1 exists. Moreover, the anchor block in reward cycle 1 has been affirmed twice -- both by the miners in reward cycle 3's prepare phase, and the miners in reward cycle 4's prepare phase. The anchor block in reward cycle 2 has not been affirmed.

The act of building anchor blocks on top of anchor blocks gives us a way to weight the corresponding sortition histories. An anchor block gets "heavier" as the number of descendant anchor blocks increases, and as the number of reward cycles without anchor blocks increases. This is because in both cases, miners are not working on an anchor block history that would invalidate this anchor block -- i.e. they are continuously affirming that this anchor block exists.

We can define the weight of a sortition history as the weight of its heaviest anchor block. If you want to produce a sortition history that is heavier, but invalidates the last N anchor blocks, you'll have to mine at least N + 1 reward cycles. This gets us a form of Nakamoto consensus for the status of anchor blocks -- the more affirmed an anchor block is, the harder it is to get it unaffirmed. By doing this, we address the first problem with PoX anchor blocks: in order to hold the chain hostage, you have to continuously mine reward cycles that confirm your missing anchor block.

Implementation: Affirmation Maps

We track this information through a data structure called an affirmation map. An affirmation map has the following methods:

  • at(i): Determine the network's affirmation status of the anchor block for the ith reward cycle, starting at reward cycle 1 (reward cycle 0 has no anchor block, ever). The domain of i is defined as the set of reward cycles known to the node, excluding 0, and evaluates to one of the following:

    • p: There is an anchor block, and it's present
    • a: There is an anchor block, and it's absent
    • n: There is no anchor block
  • weight(): This returns the maximum number of anchor blocks that descend from an anchor block this affirmation map represents
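
To make the interface concrete, here is a minimal sketch of such a structure. The type and method names are illustrative and may not match the code in this PR. In particular, `weight()` here simply counts the entries that are not `a`, which is one reading of "heavier with more descendant anchor blocks and more empty reward cycles"; treat that as an assumption.

```rust
/// Affirmation status of one reward cycle's anchor block.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Entry {
    Present,  // 'p': there is an anchor block, and it's present
    Absent,   // 'a': there is an anchor block, and it's absent
    NoAnchor, // 'n': there is no anchor block
}

/// Hypothetical affirmation map; entries[0] describes reward cycle 1.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct AffirmationMap {
    entries: Vec<Entry>,
}

impl AffirmationMap {
    /// Parse a string such as "papp"; returns None on any other character.
    pub fn decode(s: &str) -> Option<AffirmationMap> {
        let entries = s
            .chars()
            .map(|c| match c {
                'p' => Some(Entry::Present),
                'a' => Some(Entry::Absent),
                'n' => Some(Entry::NoAnchor),
                _ => None,
            })
            .collect::<Option<Vec<_>>>()?;
        Some(AffirmationMap { entries })
    }

    /// Status for reward cycle `i`. Cycle 0 never has an anchor block,
    /// and cycles beyond the end of the map are unknown, so both yield None.
    pub fn at(&self, i: usize) -> Option<Entry> {
        if i == 0 {
            return None;
        }
        self.entries.get(i - 1).copied()
    }

    /// Assumed weight: the number of entries that are not `Absent`.
    pub fn weight(&self) -> usize {
        self.entries.iter().filter(|e| **e != Entry::Absent).count()
    }
}
```

For example, decoding `"papn"` gives `at(2) == Some(Entry::Absent)` and, under the assumed definition, a weight of 3.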

Each block-commit represents an affirmation by the miner about the state of the anchor blocks that the block-commit's Stacks block confirms. When processing block-commits, the node will calculate the affirmation map for each block-commit inductively as follows:

  • If the block-commit is in the prepare phase for reward cycle R:
    • If there is an anchor block for R:
      • If this commit descends from the anchor block, then its affirmation map is the same as the anchor block's, plus having at(R) set to p
      • Otherwise, its affirmation map is the same as the anchor block's, plus having at(R) set to a
    • Otherwise:
      • If the parent descended from some anchor block at reward cycle R - k then this commit's affirmation map is the same as its parent, plus having at(R - k) set to p, plus having all at(R - k < x < R) set to n if reward cycle x doesn't have an anchor block, and a if it does.
      • Otherwise, this commit's affirmation map is defined as at(x) set to n if reward cycle x doesn't have an anchor block, and a if it does.
  • Otherwise:
    • If the parent descended from some anchor block in reward cycle R - k, then this commit's affirmation map is the same as its parent, plus having at(R - k < x < R) set to n if reward cycle x doesn't have an anchor block, and a if it does.
    • Otherwise, this commit's affirmation map is defined as at(x) set to n if reward cycle x doesn't have an anchor block, and a if it does.

Consider the example above, where we have anchor block histories 1,3,4 and 1,2.

  • A block-commit in the prepare-phase for reward cycle 4 that confirms the anchor block for reward cycle 4 would have affirmation map papp, because it affirms that the anchor blocks for reward cycles 1, 3, and 4 exist.
  • A block-commit in the prepare-phase for reward cycle 4 that does NOT confirm the anchor block for reward cycle 4, but descends from a block that descends from the anchor block in reward cycle 3, would have the affirmation map papa, because it does NOT affirm that the anchor block for reward cycle 4 exists, but it DOES affirm that the anchor block history terminating at the anchor block for reward cycle 3 exists.
  • A block-commit in the prepare-phase for reward cycle 4 that descends from a block that descends from the anchor block for reward cycle 2 would have affirmation map ppaa, because it builds on the anchor block for reward cycle 2, but it doesn't build on the anchor blocks for 3 and 4.
  • Suppose reward cycle 5 rolls around, and no anchor block is chosen at all. Then, a block in the reward phase for reward cycle 5 that builds off the anchor block in reward cycle 4 would have affirmation map pappn. Similarly, a block in reward cycle 5's reward phase that builds off of the anchor block in reward cycle 2 would have affirmation map ppaan.
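
The examples above can be reproduced with a small sketch. Rather than the inductive, per-commit calculation described earlier, this is a flattened restatement: given which reward cycles have an anchor block and which of those anchor blocks a commit descends from, it writes out the map directly. The function name and its parameters are hypothetical.

```rust
use std::collections::BTreeSet;

/// Build an affirmation map string for reward cycles 1..=num_cycles.
/// `cycles_with_anchor` lists the cycles that have an anchor block;
/// `affirmed` lists the cycles whose anchor block the commit descends from.
pub fn affirmation_map(
    num_cycles: usize,
    cycles_with_anchor: &BTreeSet<usize>,
    affirmed: &BTreeSet<usize>,
) -> String {
    (1..=num_cycles)
        .map(|x| {
            if !cycles_with_anchor.contains(&x) {
                'n' // no anchor block was chosen for this cycle
            } else if affirmed.contains(&x) {
                'p' // anchor block exists and the commit builds on it
            } else {
                'a' // anchor block exists but the commit does not build on it
            }
        })
        .collect()
}
```

With anchor blocks in cycles 1 through 4, a commit descending from anchors 1, 3, and 4 yields `papp`, one descending from 1 and 3 yields `papa`, and one descending from 1 and 2 yields `ppaa`. Extending to a fifth cycle with no anchor block appends `n` to each.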

(Here's a small lemma: if any affirmation map has at(R) = n for a given reward cycle R, then all affirmation maps will have at(R) = n.)

Now that we have a way to measure affirmations on anchor blocks, we can use them to deduce a canonical sortition history as simply the history that represents the affirmation map with the highest weight() value. If there's a tie, then we pick the affirmation map with the highest i such that at(i) = p (i.e. a later anchor block affirmation is a stronger affirmation than an earlier one). This is always a tie-breaker, because each prepare-phase either affirms or does not affirm exactly one anchor block.
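
As a rough sketch of this comparison, the following works directly on affirmation maps written as strings of `p`, `a`, and `n`. The helper names are hypothetical, and the `weight` function assumes that weight is the count of non-`a` entries, which is an interpretation rather than something stated in this description.

```rust
/// Assumed weight: the number of entries that are not 'a'.
pub fn weight(am: &str) -> usize {
    am.chars().filter(|c| *c != 'a').count()
}

/// Highest reward cycle i (1-based) with at(i) = p, or 0 if there is none.
pub fn last_present(am: &str) -> usize {
    am.bytes().rposition(|b| b == b'p').map_or(0, |i| i + 1)
}

/// Pick the canonical affirmation map: highest weight first, then the
/// map whose latest 'p' entry is in the highest reward cycle.
pub fn heaviest<'a>(maps: &[&'a str]) -> Option<&'a str> {
    maps.iter()
        .copied()
        .max_by_key(|am| (weight(am), last_present(am)))
}
```

Under these assumptions, `papp` beats both `papa` and `ppaa` on weight alone, while `pap` beats `ppa` only through the tie-breaker, because its latest affirmed anchor block is in cycle 3 rather than cycle 2.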

Using Affirmation Maps

Each time we finish processing a reward cycle, the burnchain processor identifies the anchor block's commit and updates the affirmation maps for the prepare-phase block-commits in the burnchain DB (now that an anchor block decision has been made). As the DB receives subsequent reward-phase block-commits, their affirmation maps are calculated using the above definition.

Each time the chains coordinator processes a burnchain block, it sees if its view of the heaviest affirmation map has changed. If so, it executes a PoX reorg like before -- it invalidates the sortitions back to the latest sortition that is represented on the now-heaviest affirmation map. Unlike before, it will re-validate any sortitions that it has processed in the past if a prefix of the now-heaviest affirmation map has been the heaviest affirmation map in the past. This can arise if there are two competing sets of miners that are fighting over two different sortition histories. In this case, it also forgets the orphaned statuses of all invalidated and re-validated Stacks blocks, so they can be downloaded and applied again to the Stacks chain state (note that a Stacks block will be applied at most once in any case -- it's just that it can be an orphan on one sortition history, but a valid and accepted block in another).

Because we take care to re-validate sortitions that have already been processed, we avoid the second design flaw in the PoX anchor block handling -- a sortition will always be processed at most once. This is further guaranteed by making sure that the consensus hash for each sortition is calculated in part from the PoX bit vector that is induced by the heaviest affirmation map. That is, the node's PoX ID is no longer calculated from the presence or absence of anchor blocks, but instead calculated from the heaviest affirmation map as follows:

  • If at(i) is p or n, then bit i is 1
  • Otherwise, bit i is 0
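
A minimal sketch of this mapping, again over a string-encoded affirmation map. The function name is hypothetical, and it only produces bits for reward cycles 1 and up; how reward cycle 0 is represented in the node's PoX ID is not covered here.

```rust
/// Derive PoX bits from an affirmation map such as "pappn".
/// Element k of the result is the bit for reward cycle k + 1.
/// Returns None if the map contains anything other than 'p', 'a', or 'n'.
pub fn pox_bits(am: &str) -> Option<Vec<bool>> {
    am.chars()
        .map(|c| match c {
            'p' | 'n' => Some(true), // anchor block present, or none chosen
            'a' => Some(false),      // anchor block chosen but absent
            _ => None,
        })
        .collect()
}
```

For instance, `pappn` maps to the bits 1, 0, 1, 1, 1 and `ppaan` maps to 1, 1, 0, 0, 1, so two nodes that disagree on the heaviest affirmation map also disagree on the consensus hashes derived from it.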

In addition, when a late anchor block arrives and is processed by the chains coordinator, the heaviest affirmation map is consulted to determine whether or not it should be processed. If it's not affirmed, then it is ignored.

Failure Recovery

In the event that a hidden anchor block arises, this PR includes a way to override the heaviest affirmation map for a given reward cycle. If an anchor block is missing, miners can declare it missing by updating a row in the burnchain DB that marks the anchor block as forever missing. This prevents a "short" (but still devastating) reorg whereby an anchor block is missing for almost the duration of the reward cycle -- in such a case, the absence of this declaration would cause the reward cycle's blocks to all be invalidated. Adding this declaration and then mining an anchor block that does not affirm the missing anchor block would solve this for future bootstrapping nodes.

Still To Come

This PR implements the chain processing changes to calculate and track affirmation maps, and to handle affirmation reorgs. Still to come:

  • Removing the PoxSyncWatchdog, since now we can use the heaviest affirmation map to simply stall the downloader until a known-extant anchor block can be fetched
  • Downloader tests that work across PoX invalidations, in order to verify that the block header caches are properly flushed.
  • Neon integration tests to verify that nodes will correctly detect missing anchor blocks, and wait until they become available instead of simply timing out and hoping for the best.

I'm happy to jump on a (public) call and talk this over with everyone as well. This has been a long, arduous PR. As you can see from #1805, it's taken the better part of a year to come up with a solution.

Slides from the call: https://docs.google.com/presentation/d/1iXvQlVZJ30xEB25v3ILHlcKU8eXB9MqpcMUPtoISpM4/edit?usp=sharing

Recording of the call: https://us06web.zoom.us/rec/share/vWdVjQ9I_rHsqRiLyo_FBdZFJbsr33tvVl2BdajfwJRFcxxGWrxyyfTuIXfrd-cP.LltAXR2SgAv7H_Vf?startTime=1623866540000

Passcode: nHU@4ENY

…hes when calculating the next consensus hash
… conversion, and require that indexer instances be owned by the burnchain synchronization methods.
…o it can be used when deducing the PoX anchor block from burnchain state.
… history of network affirmations made on the status of prior PoX anchor blocks. Implement all of the logic required to scan a reward cycle from burnchain block commits, deduce which block-commit is the anchor block, and tag each block-commit with the affirmation map it represents. Also, add lots of unit tests for this!
…nclude all the data needed to calculate each block-commit's affirmation map. Also, add the routines necessary to deduce the affirmation map for a given reward cycle R, if the affirmation maps for all prior reward cycles up to R are already known.
…coordinator to search for multiple histories of sortitions with different PoX IDs, and revalidate previously-invalidated sortitions should the canonical PoX ID change.
…rgotten, so a previously-unprocessable Stacks block can become processable again (i.e. in the event of a PoX reorg). In addition, remove a race condition in the invalid-block-deletion logic by moving the block file *and then* truncating it.
…anchor blocks. The canonical Stacks fork must pass through the longest history of anchor blocks (by number of anchor blocks and empty reward cycles). Use anchor block affirmation maps to identify and track the heaviest anchor block history, and if the heaviest affirmation map changes, invalidate sortitions and reprocess them, but this time, use the new heaviest affirmation map to deduce which anchor blocks *must exist*. This not only makes it possible to reorg the Stacks blockchain if the network loses an anchor block, but also makes the act of re-affirming an anchor block N reward cycles ago *at least as hard as* mining N+1 new reward cycles.
…ry forks. Make it so the tests confirm that two anchor block history forks can "take turns" being the canonical fork, ensuring that the chains coordinator correctly reprocesses and even revalidates previously-invalid sortitions and their Stacks blocks.
…epochs for Stacks 2.1, and remove some dead code
@project-bot project-bot bot added this to Review in progress in Stacks Blockchain Board Jun 14, 2021
…xist, and have the test framework actually go and connect to the DB (instead of trying to open an existing DB)
… of has_stored_block() to include testing the presence of the block's processed bit in the staging_blocks table (and checking against the headers DB to see if a block has truly been added to the chainstate when considering a new block)
Contributor

@gregorycoppola gregorycoppola left a comment

LGTM on the high-level concept of this PR. it is an elegant solution to a significant problem.

LGTM on the overall code style and lower level implementation stuff.

really like the documentation.

but.. what is the status of the testing? has that been added? if so, which are the key tests, and, if not, when do the tests get added?

@@ -1482,6 +1481,7 @@ fn bitcoind_forking_test() {

// Let's create another fork, deeper
let burn_header_hash_to_fork = btc_regtest_controller.get_block_hash(206);
eprintln!("Instigate 10-block deep fork");
Contributor

why is this eprint?

Member Author

Makes it easier to see in logs.

Contributor

@gregorycoppola gregorycoppola left a comment

Assuming it's tested, I approve.

/// There were two related design flaws in the way the Stacks blockchain deals with PoX anchor blocks:
///
/// * If it is ever the case in which a PoX anchor block is missing, and yet somehow manages to achieve 80% or more
/// confirmations during the prepare phase, then the subsequent arrival of that anchor block will cause a _deep_ chain
Contributor

Is it the case that, when the missing PoX anchor block arrives, the chain would reorg its canonical fork to include the missing anchor block? Prior to the arrival of the missing PoX anchor block, what would be the canonical fork?

Member Author

Right now, if the late PoX anchor block arrives (no matter how late), the chain would indeed reorg to include it.

If the PoX anchor block is missing, then miners resort to PoB mining, and mine off of the highest chain tip they know about (i.e. an ancestor of the missing PoX anchor block).

/// * If it is ever the case in which a PoX anchor block is missing, and yet somehow manages to achieve 80% or more
/// confirmations during the prepare phase, then the subsequent arrival of that anchor block will cause a _deep_ chain
/// reorg. It doesn't matter how many future blocks get mined -- if the anchor block is later revealed, it will
/// invalidate all of the blocks that did not build on it. While mining and confirming an anchor block is very costly,
Contributor

Suggest to clarify difference between anchor block and PoX anchor block.

While mining and confirming a PoX anchor block

///
/// This is slightly different than the definition in SIP-007. We're only looking at block-commits now. If there are
/// two or more reward-phase block-commits that got F*w confirmations, then we select the block-commit that got the most
/// BTC. If this block-commit doesn't actually correspond to a Stacks block, then there is no anchor block for the
Contributor

Could there exist a tie between two block-commits with same BTC?

Member Author

Yes; if this happens, the one that's higher in the chain wins.

Stacks Blockchain Board automation moved this from Review in progress to Reviewer approved Nov 14, 2022
Stacks 2.1 automation moved this from Review in progress to Reviewer approved Nov 14, 2022
@jcnelson
Member Author

Finally 🎉

@jcnelson jcnelson merged commit e72c264 into next Nov 14, 2022
Stacks Blockchain Board automation moved this from Reviewer approved to Done Nov 14, 2022
Stacks 2.1 automation moved this from Reviewer approved to Done Nov 14, 2022
@jcnelson jcnelson mentioned this pull request Nov 16, 2022