Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential array-out-of-bound runtime error in get_attesting_indices #2500

Open
booleanfunction opened this issue Jun 30, 2021 · 1 comment
Open
Labels
scope:security General protocol security-related items

Comments

@booleanfunction
Copy link
Contributor

What the problem is and where it is located

The following relates to eth2.0-specs/specs/phase0/beacon-chain.md.

It appears that the committee created in get_attesting_indices can exceed length of MAX_VALIDATORS_PER_COMMITTEE and hence this will cause an array-out-of-bound runtime error when accessing bits to return the set of attesting indices. The problem occurs when the number of active validators exceeds 2**22 = 4,194,304.

code1

I would note that strictly speaking, the SSZ definition of a bitlist means that |bits| <= MAX_VALIDATORS_PER_COMMITTEE and so a more accurate requirement for get_attesting_indices is that:

|committee| <= |bits| <= MAX_VALIDATORS_PER_COMMITTEE (i.e.2048)

however, the purpose of this analysis is to understand when
|committee| > MAX_VALIDATORS_PER_COMMITTEE.

How was the problem discovered

The problem was discovered while using Dafny to formally verify the Beacon Chain Phase 0 spec (eth2.0-dafny).

As I think the issue can be difficult to detect, particularly with the use of floor division and the dependency of parameter calculations between functions, I thought it would be useful to step through the calculations involved, which I do below.

Results of the analysis

The key results are that:

  • when 32 <= number of active validators <= 4,194,304, the length of committee will range between 1 and MAX_VALIDATORS_PER_COMMITTEE (inclusive) for all data.slot and data.index combinations
  • when 4,194,304 < number of active validators < 4,196,352, committee will have length greater than MAX_VALIDATORS_PER_COMMITTEE for at least one data.slot and data.index combination
  • when 4,196,352 <= number of active validators, committee will have length greater than MAX_VALIDATORS_PER_COMMITTEE for all data.slot and data.index combinations

And for completeness, please note that:

  • when number of active validators < 32, committee will have length 0 for some data.slot and data.index combinations

Possible solution

If in get_attesting_indiceswe assume that |bits| = MAX_VALIDATORS_PER_COMMITTEE then a basic solution is to modify compute_committee such that:

end = min(start + MAX_VALIDATORS_PER_COMMITTEE, (len(indices) * (index+1)) // count).

Detailed analysis

Context

get_attesting_indices is used as part of the rewards and penalties, justification and finalisation, as well as process_attestion.

For context I thought it was useful to show the function call paths.

functions

Tracing the problem

To determine how the length of committee is calculated we must look at get_beacon_committee.

code2

The output of get_beacon_committee is then determined by the call to compute_committee.

code3

If we start with the return statement in compute_committee, the returned sequence will have a length determined by range(start, end). Specifically, range(start, end) will yield a sequence of length end-start and hence:

|compute_committee(indices, seed, index, count)| 
= end-start
= (len(indices) * (index+1)) // count - (len(indices) * index) // count

When compute_committee is called from get_beacon_committee we see that:

  • indices = get_active_validator_indices
  • seed = get_seed
  • index = (slot % SLOTS_PER_EPOCH) * committees_per_slot + index
  • count = committees_per_slot * SLOTS_PER_EPOCH

(Note that the index parameter in get_beacon_committee is different to the index parameter in compute_committee).

Making these substitutions, at the get_beacon_committee level we determine end and start as follows:

  • end = (len(get_active_validator_indices) * (((slot % SLOTS_PER_EPOCH) * committees_per_slot + index)+1)) // (committees_per_slot * SLOTS_PER_EPOCH)
  • start = (len(get_active_validator_indices) * ((slot % SLOTS_PER_EPOCH) * committees_per_slot + index)) // (committees_per_slot * SLOTS_PER_EPOCH)

This is also a good time to mention that committees_per_slot is determined by a call to get_committees_per_slot and is calculated as follows:

committees_per_slot = max(1, min(MAX_COMMITTEES_PER_SLOT, len(get_active_validator_indices) // SLOTS_PER_EPOCH // TARGET_COMMITTEE_SIZE))

committees_per_slot is therefore a function of the number of active validators and 0 < committees_per_slot <= 64.

We could now substitute the formula for committees_per_slot into end and start but for the purpose of this analysis we can conclude that end - start is a function of:

  • len(get_active_validator_indices),
  • slot % SLOTS_PER_EPOCH, and
  • index (i.e. the committee index parameter of get_beacon_committee).

It is also useful to note the following bounds on these values:

  • 0 <= slot % SLOTS_PER_EPOCH < 32
  • 0 <= index < committees_per_slot

Putting all of this together, when get_beacon_committee(state, data.slot, data.index) is called from get_attesting_indices for a state of a particular number of active validators, it can be done so for multiple (data.slot,data.index) combinations according to the above (slot % SLOTS_PER_EPOCH,index) bounds.

Further in relation to the index bounds, as committees_per_slot depends on the number of activate validators, the range of index also depends on the number of active validators. Once the number of active validators is sufficiently large, there will be a maximum of 32 * 64 = 2048 of these (slot % SLOTS_PER_EPOCH,index) combinations to consider for a state with a particular number of active validators.

To determine for which (len(get_active_validator_indices), slot % SLOTS_PER_EPOCH,index) combinations we find that end - start > MAX_VALIDATORS_PER_COMMITTEE we need to analyse the above formulas for end and start. Of course, the use of floor division (i.e. //) makes this more difficult than regular division and hence here I will just show the overall results but I will include several specific examples where a bound breach occurs (i.e. where end - start > MAX_VALIDATORS_PER_COMMITTEE).

I thought some visualisations might assist.

Let's firstly consider how many committees_per_slot there will be for 0 < len(get_active_validator_indices) < 4,250,000.

analysis1

We see that once there are 2**18=262,144 or more active validators, committees_per_slot will reach its upper bound of 64.

The value of committees_per_slot consequently determines the number of (slot % SLOTS_PER_EPOCH,index) combinations to consider for a state with a particular number of active validators.

In the next graph we plot the number of (slot % SLOTS_PER_EPOCH,index) combinations, as well as how many of these input combinations will produce end - start > MAX_VALIDATORS_PER_COMMITTEE and hence ultimately how many |committee| > MAX_VALIDATORS_PER_COMMITTEE breaches occur at a given number of active validators.

analysis2

We observe that the number of (slot % SLOTS_PER_EPOCH,index) combinations reaches its maximum of 32 * 64 = 2048 at 2**18=262144 active validators (i.e. as is expected given the Figure 1 in which committees_per_slot reaches its maximum at that number of active validators).

We can also observe that the number of these combinations where |committee| = end - start > MAX_VALIDATORS_PER_COMMITTEE is zero until the number of active validators exceeds 2**22 = 4,194,304.

If we focus on this particular area of the graph, we see:

analysis3

Hence, once the number of active validators exceeds 2**22 = 4,194,304 we can observe that there are (slot % SLOTS_PER_EPOCH,index) combinations such that |committee| > MAX_VALIDATORS_PER_COMMITTEE. Further, once the number of active validators reaches 2**22 + 2**11 = 4,196,352 we can observe that all (slot % SLOTS_PER_EPOCH,index) combinations will result in |committee| > MAX_VALIDATORS_PER_COMMITTEE.

Examples

We can now look at a couple of examples to see which (slot % SLOTS_PER_EPOCH,index) combinations cause the problem for a set number of active validators.

For example, when the number of active validators is 4,194,305 we can see that (slot % SLOTS_PER_EPOCH=31,index=63) will be the only committee such that |committee| > MAX_VALIDATORS_PER_COMMITTEE. i.e. as indicated by the red block.

eg1

Where as when the number of active validators is for example 4,194,355, we can see that numerous (slot % SLOTS_PER_EPOCH,index) combinations result in |committee| > MAX_VALIDATORS_PER_COMMITTEE. i.e. as indicated by the red blocks.

eg2

I was able to prove the generalised results using formal verification in Dafny (eth2.0-dafny) but to create the visualisations shown here I simply created a Jupyter notebook so as to explore the equations numerically.

@djrtwo
Copy link
Contributor

djrtwo commented Jul 6, 2021

This is a known bound chosen and based on the instantiation of the beacon chain on mainnet Ethereum, assuming a maximum suppy of ~130M ETH and thus a maximum number of active validators.

This is noted briefly in the annotated spec under MAX_VALIDATORS_PER_COMMITTEE.

Two quick comments:

  1. if there is a simple way to protect against this edge case, we can soft for it it without any issue. That said, any even moderate complexity on this edge is likely not worth it today given the bounds of mainnet ETH
  2. The most dangerous thing is likely if these values get naively parroted to another network instantiation that has different minimum ETH requirements or a different ETH supply. OR if ethereum beacon chain mainnet was later upgraded to do a validator split to require much less ETH per validator which could unexpectedly hit this threshold. At the very least this should be well documented somewhere.

[side note: there are many configuration relationships and details that are not well documented and would be weird foot-guns if trying to change configs for other networks. there is likely a whole project here on documenting these relationships]

@hwwhww hwwhww added the scope:security General protocol security-related items label Jul 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
scope:security General protocol security-related items
Projects
None yet
Development

No branches or pull requests

5 participants
@djrtwo @hwwhww @booleanfunction and others