Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(block, slot) fork choice #2197

Closed
wants to merge 25 commits into from
Closed

(block, slot) fork choice #2197

wants to merge 25 commits into from

Conversation

adiasg
Copy link
Contributor

@adiasg adiasg commented Feb 10, 2021

This PR proposes a fix to a fork-choice attack based on the LMD block scoring algorithm. The attack was disclosed in this paper, where a detailed description can be found.

A brief depiction of the attack from the paper:
image

The solution proposed in this PR disallows the attack by modifying how the fork choice takes into account attestations for empty slots. For example, in the above figure, the proposed fork choice rule should choose an empty block at slot n+1 rather than the attacker's private block, because more attestations support the former path in the block tree.

This is implemented by applying the fork choice on a tree of (block, slot)-nodes rather than the standard block tree. For example, when receiving blocks & attestations from the above figure, an Eth2 client would prepare the following (block, slot)-tree:
image

specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
Copy link
Contributor

@djrtwo djrtwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a very fast first pass!

will review more deeply tomorrow morning

# If leaf block, check finalized/justified checkpoints as matching latest.
head_state = store.block_states[block_root]
# If leaf node, check finalized/justified checkpoints as matching latest.
head_state = store.block_states[node.block_root]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we not need the state at the (block, slot) combo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t appear so. The state is being used in the fork choice:

  1. in filter_block_slot_tree(), to check if a particular (block, slot)-node has the correct justified/finalized checkpoint — we can get away with using states from only the “real” blocks here
  2. in store_target_checkpoint_state(), to store the target checkpoint state — this already accounts for empty slots

beacon_block_root = attestation.data.beacon_block_root
for i in attesting_indices:
if i not in store.latest_messages or target.epoch > store.latest_messages[i].epoch:
store.latest_messages[i] = LatestMessage(epoch=target.epoch, root=beacon_block_root)
if i not in store.latest_messages or attestation_slot > store.latest_messages[i].slot:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to be careful. A validator shouldn't be able to update their epoch vote if it's same epoch but later slot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We check is_valid_indexed_attestation() (in on_attestation()) before this piece of code. This prevents the above case, since validators are assigned a single slot to attest for in each epoch.

specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
@adiasg adiasg added scope:fork-choice scope:security General protocol security-related items labels Feb 11, 2021
@mcdee
Copy link
Contributor

mcdee commented Feb 12, 2021

Is there any concern here that blocks can be censored? If a block at (block n, n) is published but enough attestations are made for (block n-1, n) then the block at n will be orphaned. For a large enough validator or set of validators this appears to give them censorship-level control of the chain.

And for smaller sets of validators does this make always voting for an empty slot optimal? It seems that in a world where everyone loosely agrees to vote for an empty slot the validators will be pretty much guaranteed income, whereas in a world where everyone agrees to vote for the latest head introduces the uncertainty we see today, where votes are split between those who see the block for the current slot and those that do not, hence income is not guaranteed.

@mkalinin
Copy link
Collaborator

And for smaller sets of validators does this make always voting for an empty slot optimal?

Won't they vote for a block rather than empty slot because their attestations might be carried by that block?

@mcdee
Copy link
Contributor

mcdee commented Feb 12, 2021

And for smaller sets of validators does this make always voting for an empty slot optimal?

Won't they vote for a block rather than empty slot because their attestations might be carried by that block?

Thing is, a correct head vote that's delayed by a slot or two is worth a lot more than an incorrect head vote that goes in the next slot. So if the set of validators is not 100% it seems that they'd be better off voting for the empty slot on the basis that they will often be correct (but not included immediately) but sometimes incorrect (but then the previous attestations are included so they are net positive).

It may come down to the % of the validators doing this, though; I haven't run the maths.

@lsankar4033
Copy link
Contributor

lsankar4033 commented Feb 12, 2021

Thing is, a correct head vote that's delayed by a slot or two is worth a lot more than an incorrect head vote that goes in the next slot. So if the set of validators is not 100% it seems that they'd be better off voting for the empty slot on the basis that they will often be correct (but not included immediately) but sometimes incorrect (but then the previous attestations are included so they are net positive).

Assuming no coordination and the option for a validator to choose b/w {(block n, n), (block n-1, n)}, you'd expect them to attest to whichever they think most others will attest to. Arguably, if they see block n in a timely manner (i.e. not at the end of the slot), they'd probably assume everyone else sees that block and thus vote (block n, n).

Which is to say that as long as network propagation is efficient enough, I don't think this is an issue.

@adiasg
Copy link
Contributor Author

adiasg commented Feb 13, 2021

@mcdee:

And for smaller sets of validators does this make always voting for an empty slot optimal?

A rational validator should make an attestation for a block or (block, slot) that they think is going to be included in the chain.
The choices that a validator sees in the current spec are between voting for a block from a past slot, or a block at the current slot. The rational action in this case is to vote for some block from the past slot that has more supporting weight than block at the current slot.
The choices with this PR are to vote for an empty block at the current slot, or current block at the current slot. I'm not sure why the rational action in this case would be that:

everyone loosely agrees to vote for an empty slot

For reasons mentioned in @lsankar4033's above comment, I think that voting for the current block at the current slot would be equilibrium action when all validators think that:

  1. the network is synchronous enough that everyone receives blocks in time, and
  2. more than half of the validators are honest and would vote for the current block.

@mcdee:

Is there any concern here that blocks can be censored? If a block at (block n, n) is published but enough attestations are made for (block n-1, n) then the block at n will be orphaned. For a large enough validator or set of validators this appears to give them censorship-level control of the chain.

Good point! This PR gives power to the attacker to create a new (virtual) block at any slot, but otherwise all remains similar to the current spec. At first thought, since this attack is in the scope of only >50% attackers, the removal of the requirement to control proposers at attack slots only presents a small degradation in security. Specifically, the attack becomes twice more likely (since currently, there is only a 50% chance that the proposer is under a 50% attacker's control).

In any case, even if this PR presents a degradation of security against >50% attackers, it highly enhances security against smaller, more realistic ~30% attackers.

Copy link
Contributor

@hwwhww hwwhww left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎉

My comments are mostly suggestions on data structures.

Do you plan to rewrite it to a HF1 patch document?

specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
specs/phase0/fork-choice.md Show resolved Hide resolved
specs/phase0/fork-choice.md Outdated Show resolved Hide resolved
@adiasg
Copy link
Contributor Author

adiasg commented Feb 22, 2021

One of the disadvantages of the current proposal is that the performance under bad network conditions becomes worse. Chain growth stops completely if block latency is > SECONDS_PER_SLOT / 3 (= 4 seconds in the current config).
This happens because the gist of the proposed solution for the original attack is to orphan blocks that are not seen within SECONDS_PER_SLOT / 3 by a majority of validators.

@dankrad
Copy link
Contributor

dankrad commented Mar 4, 2021

One of the disadvantages of the current proposal is that the performance under bad network conditions becomes worse. Chain growth stops completely if block latency is > SECONDS_PER_SLOT / 3 (= 4 seconds in the current config).
This happens because the gist of the proposed solution for the original attack is to orphan blocks that are not seen within SECONDS_PER_SLOT / 3 by a majority of validators.

This makes me increasingly worried about this proposal. It seems to me that encountering adversarial network conditions is much more likely than a 30% attacker doing small reorgs or attacking finality (note that they are only 3.3% away from permanently denying finality anyway). It is also a lot cheaper -- I need to invest over 1B$ to execute this attack which is only a degradation of the chain, not even a full DOS. I think even long-term DOS attacks at the network level are much cheaper than that to execute. (Yes, comparing capital and operational cost here, but at 3 orders of magnitude still quite relevant)

I was originally for the (block, slot) fork choice because it seemed more elegant and compatible with FFG. However, now I think the hard synchronicity constraint it introduces is not worth it. I instead suggest moving the FFG target block one slot back to the end of the epoch as suggested in #2174. This makes the attack already 3x less common.

@dankrad
Copy link
Contributor

dankrad commented Mar 4, 2021

Another thing to consider is that the executable beacon chain proposal, without implementing the pre-state root proposal as I suggested here, would add the full Eth1 execution to the block latency (as it has to be executed before you can attest). So already Eth1 blocks with ~2s execution time could effectively DOS the beacon chain, if blocks can't get attestations at >4s delay.

@djrtwo
Copy link
Contributor

djrtwo commented Mar 10, 2021

I'm closing this temporarily (not deleting branch).

We are exploring other directions and will not be including this in Altair

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
DO NOT MERGE scope:fork-choice scope:security General protocol security-related items
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants