Fix timing attack #2101

adiasg · 2020-10-13T20:37:30Z

This PR introduces a simple fix for the fork choice timing attack presented in this paper: https://arxiv.org/abs/2009.04987

By making the attestation production time unpredictable to the attacker & unique for each validator, we make it harder for an attacker to separately influence the fork choice of disjoint subsets of validators. The attack relies on sending well-timed messages to each subset, such that these messages are not gossiped to the other subset of validators before attestations for the slot are produced.

mcdee · 2020-10-14T07:27:42Z

Has there been any consideration of the impact this could have on attestations going "too early" and missing the current block?

From local measurements, currently around 90% of blocks are received by my local validator before the 4 second mark. However, only around 60% of blocks are received before the 2 second mark. Obviously we'd expect a spread of validators across the 2-6 second range, but it does appear that this will reduce the % of validator clients that will use the block of the current slot in their attestation.

adiasg · 2020-10-14T18:41:38Z

Tuning the numbers according to real-world data was the motivation for converting the constants to configuration parameters 😃

Given your observations about block timings, it makes sense to change the attestation production time to 4 ± 1 sec from the start of the slot.

@mcdee Can you share the collected data so we can analyze this more?

mcdee · 2020-10-14T18:48:35Z

blockdelay.txt

Here's a dump of the last ~7.5K blocks against one of my validator clients, standard prometheus histogram.

adiasg · 2020-10-14T18:58:01Z

Thanks for sharing! Looks like ~75% of blocks are seen within 3 seconds and ~85% are seen within 4 seconds of the slot start, so 4 ± 1 sec is a reasonable configuration.

djrtwo

pretty minor comments

Also, we use the magic number of 3 in aggregation broadcast. consider updating to the constant

djrtwo · 2020-10-13T22:19:42Z

specs/phase0/validator.md

+
+A validator should create and broadcast the `attestation` to the associated attestation subnet when the earlier one of these two events occurs:
+  - the validator has received a valid block from the expected block proposer for the assigned `slot`, or
+  - `SECONDS_PER_SLOT/3 + slot_timing_entropy` seconds have elapsed since the start of the `slot` (using the `slot_timing_entropy` generated for this slot)


Suggested change

- `SECONDS_PER_SLOT/3 + slot_timing_entropy` seconds have elapsed since the start of the `slot` (using the `slot_timing_entropy` generated for this slot)

- `SECONDS_PER_SLOT / 3 + slot_timing_entropy` seconds have elapsed since the start of the `slot` (using the `slot_timing_entropy` generated for this slot)

specs/phase0/validator.md

Co-authored-by: Danny Ryan <dannyjryan@gmail.com>

terencechain · 2020-10-15T00:29:50Z

specs/phase0/validator.md

@@ -391,7 +393,13 @@ def get_block_signature(state: BeaconState, block: BeaconBlock, privkey: int) ->

 A validator is expected to create, sign, and broadcast an attestation during each epoch. The `committee`, assigned `index`, and assigned `slot` for which the validator performs this role during an epoch are defined by `get_committee_assignment(state, epoch, validator_index)`.

-A validator should create and broadcast the `attestation` to the associated attestation subnet when either (a) the validator has received a valid block from the expected block proposer for the assigned `slot` or (b) one-third of the `slot` has transpired (`SECONDS_PER_SLOT / 3` seconds after the start of `slot`) -- whichever comes _first_.
+For each `slot`, a validator must generate a uniform random variable `slot_timing_entropy` between `(-SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR, SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR)` with millisecond resolution and using local entropy.


can the entropy be shared by multiple validators that are served under the same beacon node?

this will simplify the beacon node implementation

Yes, the entropy can be shared by multiple validators that are served under the same beacon node.

Do you mean the source of entropy can be shared, or that the randomly selected slot_timing_entropy value can be shared by all validators served by the same beacon node?

There are significant performance implications if every individual validator has to select the latest head and create its attestation at a different time. Currently a Validator Client only needs to ask the beacon node to create at most one AttestationData per committee per slot because all validators in that same committee can create an attestation from that AttestationData. And all validators can share the same selected head block.

With this change, if the value of slot_timing_entropy can't be shared, the number of validators a beacon node could support would be significantly reduced as it would need to update fork choice and create a new AttestationData for each individual validator.

This is the gist of this fix:

By making the attestation production time unpredictable to the attacker & unique for each validator, ...

The attestation production time doesn't have to be unique for each and every validator. However, it is absolutely crucial that the attestation production time is unpredictable for anyone who does not control the validator and/or beacon node (for clients where the beacon node is the driver of validator duties). So, validators served by the same beacon node can have the same attestation production time, i.e., they can share the source of the entropy and the actual slot_timing_entropy value.
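
As a rough illustration of the sharing described above, here is a minimal Python sketch of drawing `slot_timing_entropy` once per slot and reusing it for every validator on the same node. `SECONDS_PER_SLOT = 12` is the phase 0 mainnet value; `ATTESTATION_ENTROPY_DIVISOR = 12` is an assumption chosen to match the ± 1 sec window discussed earlier in this thread. The function and class names are hypothetical and not part of the spec.

```python
import secrets
from typing import Dict

SECONDS_PER_SLOT = 12             # phase 0 mainnet value
ATTESTATION_ENTROPY_DIVISOR = 12  # assumed; gives a +/- 1 second window


def generate_slot_timing_entropy_ms() -> int:
    """Draw a uniform offset in milliseconds from the open interval (-bound, +bound)."""
    bound_ms = SECONDS_PER_SLOT * 1000 // ATTESTATION_ENTROPY_DIVISOR
    # randbelow(2 * bound_ms - 1) yields 0 .. 2 * bound_ms - 2, so subtracting
    # (bound_ms - 1) gives -(bound_ms - 1) .. +(bound_ms - 1), endpoints excluded.
    return secrets.randbelow(2 * bound_ms - 1) - (bound_ms - 1)


class SlotEntropyCache:
    """Hypothetical helper: one entropy value per slot, shared by all local validators."""

    def __init__(self) -> None:
        self._by_slot: Dict[int, int] = {}

    def entropy_ms(self, slot: int) -> int:
        # Draw lazily on first request, then return the same value for the slot.
        if slot not in self._by_slot:
            self._by_slot[slot] = generate_slot_timing_entropy_ms()
        return self._by_slot[slot]
```

The sketch uses `secrets` rather than `random` so that the value comes from the operating system's CSPRNG, in line with the requirement that the timing be unpredictable to outsiders. A real client would also prune old slots from the cache.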

vbuterin · 2020-10-15T12:38:09Z

I'd agree that this could make attacks harder, but I don't think it's a substitute for deeper changes (eg. my "the proposer has 1/4 slot weight" proposal) that provide liveness in the standard model (attacker chooses the latency of every message within the bounds [0, delta]).

The attack under this proposal (ie. this PR) would be: the attacker connects to every node (eg. by connecting to the network with a huge number of nodes and just waiting until they get included in the network and they make up 80%+ of all nodes in the network), and then splits the network 50/50 by broadcasting a set of attestations at exactly the time window when the slot_timing_entropy is right in the middle of its probability distribution. The attacker would have better connections to all the nodes than the nodes have to each other, so them accomplishing this is well within the realm of possibility.

adiasg · 2020-10-15T14:00:16Z

The goal of the PR is to provide some satisfactory mitigation of the attack in v1.0 of the spec, while having relatively low code impact and low risk of the proposed changes. In addition, this fix is definitely backwards-compatible.

Since the attack is feasible & has become well-known by now, it would be a bad move to go ahead with v1.0 without any fixes.

arnetheduck · 2020-10-16T14:44:31Z

specs/phase0/validator.md

+For each `slot`, a validator must generate a uniform random variable `slot_timing_entropy` between `(-SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR, SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR)` with millisecond resolution and using local entropy.
+
+A validator must create and broadcast the `attestation` to the associated attestation subnet when the earlier one of the following two events occurs:
+  - The validator has received a valid block from the expected block proposer for the assigned `slot`. In this case, the validator must set a timer for `abs(slot_timing_entropy)`. The end of this timer will be the trigger for attestation production.


what is the attack vector of sending the attestation on block receipt? that has some randomness built into it "naturally"?

This is to mitigate the risk from an adversary who has faster connections to all validators than the validators have among themselves. There are already some "Layer-0" projects in this space that provide this as a service (either currently, or will do in the near future), e.g., bloXroute and Marlin.

An attacker with this capability would be able to trigger attestation production at a predictable time of its choosing by always being the first one to inform validators about a new block. Hence, the timing entropy is added to the on-block trigger to make this attack vector infeasible.

by the way, the time block_arrived + abs(slot_timing_entropy) should be capped at SECONDS_PER_SLOT / ATTESTATION_PRODUCTION_DIVISOR + slot_timing_entropy. Otherwise, in the worst case, we'd effectively see the attestation being sent out at slot + 6s instead of slot + 5s being the maximum, which gets very close to the aggregation cutoff time and increases the risk of loss of reward.
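
The capping suggested above could be sketched as follows. This is only an illustration: `ATTESTATION_PRODUCTION_DIVISOR = 3` and `ATTESTATION_ENTROPY_DIVISOR = 12` are assumed values consistent with the 4 ± 1 sec configuration discussed in this thread, and `attestation_trigger_time` is a hypothetical helper, not a spec function.

```python
from typing import Optional

SECONDS_PER_SLOT = 12               # phase 0 mainnet value
ATTESTATION_PRODUCTION_DIVISOR = 3  # assumed; timeout centred at 4 s
ATTESTATION_ENTROPY_DIVISOR = 12    # assumed; entropy within +/- 1 s


def attestation_trigger_time(
    slot_timing_entropy: float, block_arrival: Optional[float]
) -> float:
    """Return the attestation time in seconds after the start of the slot.

    block_arrival is the time the expected block was received, or None if
    no block has arrived.
    """
    # Timeout trigger: fires even if no block is ever received.
    timeout = SECONDS_PER_SLOT / ATTESTATION_PRODUCTION_DIVISOR + slot_timing_entropy
    if block_arrival is None:
        return timeout
    # On-block trigger, capped so a late block cannot push the
    # attestation past the timeout trigger.
    return min(block_arrival + abs(slot_timing_entropy), timeout)
```

For example, with `slot_timing_entropy = 0.5` and a block arriving at 4.25 s, the uncapped on-block timer would fire at 4.75 s, while the cap keeps the attestation at 4.5 s.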

adiasg · 2020-10-16T21:21:40Z

The attack under this proposal (ie. this PR) would be: the attacker connects to every node (eg. by connecting to the network with a huge number of nodes and just waiting until they get included in the network and they make up 80%+ of all nodes in the network), and then splits the network 50/50 by broadcasting a set of attestations at exactly the time window when the slot_timing_entropy is right in the middle of its probability distribution. The attacker would have better connections to all the nodes than the nodes have to each other, so them accomplishing this is well within the realm of possibility.

Yes, this attack is possible even with the fix from this PR in place. However, the chance of success of the attack is substantially lower than before!

Let's label the time when slot_timing_entropy is right in the middle of its probability distribution as t_attack, and assume that the network is synchronous. Then,

- Half of the validators will have already produced attestations by t_attack. These validators are unaffected by the attack, and make attestations for a single chain.
- The other half of the nodes will produce attestations sometime within (t_attack, t_attack + SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR).
  - The attacker will be sending two different messages to two distinct subsets of validators (say, subset A and B) such that the messages are conflicting. These messages are only useful to the attacker when they are not gossiped between validators in A and B.
  - Let min_network_latency be the minimum network latency between validators in A and B. Then, the attacker will only be able to influence the attestations of validators who are scheduled to produce messages between (t_attack, t_attack + min_network_latency). On average, this will be the fraction min_network_latency * ATTESTATION_ENTROPY_DIVISOR / SECONDS_PER_SLOT. With the current values for the constants, this comes out to be min_network_latency (the numerical value in seconds).
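
The estimate above can be checked with a small calculation. The sketch below assumes `SECONDS_PER_SLOT = 12` and `ATTESTATION_ENTROPY_DIVISOR = 12` (the latter is inferred from the ± 1 sec window and is not confirmed by the thread). The function name is hypothetical, and the clamp to 1.0 is an addition for latencies longer than the window.

```python
SECONDS_PER_SLOT = 12             # phase 0 mainnet value
ATTESTATION_ENTROPY_DIVISOR = 12  # assumed; gives a 1 second window after t_attack


def influenceable_fraction(min_network_latency: float) -> float:
    """Fraction of the later-attesting validators the attacker can influence.

    Implements min_network_latency * ATTESTATION_ENTROPY_DIVISOR / SECONDS_PER_SLOT,
    clamped to 1.0 when the latency exceeds the window.
    """
    window = SECONDS_PER_SLOT / ATTESTATION_ENTROPY_DIVISOR
    return min(1.0, min_network_latency / window)
```

With these constants, a minimum latency of 0.25 seconds between A and B corresponds to a fraction of 0.25.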

nrryuya · 2020-11-09T02:20:47Z

I'm curious about how valuable this fix is (e.g., how much weaker the network model where liveness is guaranteed becomes, how much the fault tolerance changes) compared to the additional complexity of the implementation, the effect on the efficiency of the attestation aggregation, and the risk of unknown side effects (for instance, this fix will affect the analysis of the incentive compatibility of the timing of attesting).

These messages are only useful to the attacker when they are not gossiped between validators in A and B.

The attacker's attestations are useful even if some portion of validators receive the attacker's attestations from the other subset and switch the chain they vote for. The difference of the two target chains' scores at the end of the current slot is |(the validators in A who switched) - (the validators in B who switched)|. If this difference is smaller than the number of the attacker's attestations in the next slot, the attacker can restore the tie. (Although in the original attack in the Ebb-and-Flow paper the attacker uses a few attestations per slot, the attacker can have many more attestations per slot, in proportion to its stake.)
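
The condition in this comment can be written as a short check. This is a simplified sketch that counts every validator with equal weight, which is an assumption; the function and parameter names are hypothetical.

```python
def attacker_can_restore_tie(
    switched_from_a: int,
    switched_from_b: int,
    attacker_attestations_next_slot: int,
) -> bool:
    """Return True if the attacker can rebalance the two chains in the next slot.

    The score gap at the end of the current slot is the absolute difference
    between the validators in A who switched and those in B who switched.
    """
    score_gap = abs(switched_from_a - switched_from_b)
    return score_gap < attacker_attestations_next_slot
```

For instance, if 30 validators in A and 25 in B switched, the gap is 5, so an attacker with 10 attestations in the next slot could restore the tie under this simplified model.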

Let min_network_latency be the minimum network latency between validators in A and B

Given the above observation, considering only the minimum network latency between the two subsets is not enough. We need to precisely analyze how many of the attacker's attestations are exchanged within the time window and how large the difference of the scores becomes as a result.

djrtwo · 2021-03-10T23:20:17Z

closing this. likely to go another path

adiasg added 6 commits October 13, 2020 13:23

Fix for timing attack paper

976d450

fix grammar

f74c99e

specify resolution of timing entropy

bcab069

refine fix spec

070c71e

define ATTESTATION_PRODUCTION_DIVISOR and ATTESTATION_ENTROPY_DIVISOR

c215617

explicitly mention fork choice execution at trigger

6776f9d

tuning ATTESTATION_ENTROPY_DIVISOR

a00317d

djrtwo changed the base branch from dev to v1.0-candidate October 14, 2020 21:34

djrtwo reviewed Oct 14, 2020

adiasg and others added 2 commits October 14, 2020 15:35

Apply suggestions from code review

d18022c

Co-authored-by: Danny Ryan <dannyjryan@gmail.com>

add entropy to on block trigger

c59519b

djrtwo approved these changes Oct 14, 2020

terencechain reviewed Oct 15, 2020

paulhauner mentioned this pull request Oct 15, 2020

Fork choice timing attack sigp/lighthouse#1773

Closed

melyalex approved these changes Oct 15, 2020

arnetheduck reviewed Oct 16, 2020

tersec mentioned this pull request Oct 28, 2020

Phase 0 validator and p2p-interface implementation tracking status-im/nimbus-eth2#1868

Closed

8 tasks

djrtwo force-pushed the v1.0-candidate branch 3 times, most recently from 368863c to 7589af8 Compare November 4, 2020 15:30

Base automatically changed from v1.0-candidate to master November 4, 2020 16:24

hwwhww changed the base branch from master to dev November 6, 2020 09:21

tersec mentioned this pull request Nov 6, 2020

[SEC] Nbc nodes have unnecessary delay when broadcasting attestations status-im/nimbus-eth2#1700

Closed

hwwhww added scope:fork-choice scope:v-guide Validator guide labels Nov 12, 2020

tersec mentioned this pull request Feb 25, 2021

refactor slot loop status-im/nimbus-eth2#2355

Merged

djrtwo closed this Mar 10, 2021

adiasg added the scope:security General protocol security-related items label Mar 11, 2021

mratsim mentioned this pull request Mar 11, 2021

Eth2 Call 59 ethereum/eth2.0-pm#208

Closed

paulhauner mentioned this pull request Mar 15, 2021

HF1 Fork Choice sigp/lighthouse#2236

Closed


adiasg commented Oct 13, 2020

mcdee commented Oct 14, 2020

adiasg commented Oct 14, 2020

mcdee commented Oct 14, 2020

adiasg commented Oct 14, 2020

djrtwo left a comment

djrtwo Oct 13, 2020

terencechain Oct 15, 2020

adiasg Oct 15, 2020

ajsutton Oct 15, 2020

adiasg Oct 15, 2020 (edited)

vbuterin commented Oct 15, 2020

adiasg commented Oct 15, 2020

arnetheduck Oct 16, 2020

adiasg Oct 16, 2020

arnetheduck Feb 25, 2021

adiasg commented Oct 16, 2020 (edited)

nrryuya commented Nov 9, 2020

djrtwo commented Mar 10, 2021
