Removed hard-coded constant in `sortition.go`. #3558

Olshansk · 2022-02-03T14:00:45Z

Summary

Remove a hardcoded constant in sortition.go which was
used as the denominator in determining the selection ratio.

This clarifies what the maximum possible output size is of
the output VRF based on the SHA algorithm used to generate it.

Purpose

This is simply a code quality change to make it easier to read and
understand without any focus on new functionality or performance.

It is unlikely that the VRF SHA length used by Algorand will changed,
so the generalization to a dynamic length based on the length of the
Crypto digest is only for the reader.

The use of an init function to initialize the maxFloat once rather than
anytime sortition runs is a minor optimization, but likely to be neglibile
in the grand scheme of things.

Test Plan

Due to the usage of C src code, a simple go test -v ./data/committee/sortition/
cannot be executed without the proper configurations. However, circle CI
should be able to execute the unit tests automatically and verify the changes.

Recommendation: Running the recommended make targets to test the changes
did not automatically due to missing dependencies, so potentially a Dockerfile
could help resolve this for future contributors.

Remove a hardcoded constant in sortition.go which was used as the denominator in determining the selection ratio. This clarifies what the maximum possible output size is of the output VRF based on the SHA algorithm used to generate it. Due to the usage of C src code, a simple `go test -v ./data/committee/sortition/` cannot be executed without the proper configurations. However, circle CI should be able to execute the unit tests automatically and verify the changes.

CLAassistant · 2022-02-03T14:01:00Z

All committers have signed the CLA.

jannotti · 2022-02-03T14:02:07Z

data/committee/sortition/sortition.go

 	precision := uint(8 * (len(vrfOutput) + 1))
-	max, b, err := big.ParseFloat("0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff", 0, precision, big.ToNearestEven)
+	maxFloatString := fmt.Sprintf("0x%s", strings.Repeat("f", crypto.DigestSize*2+1))


This can't be right, can it? Surely an odd number of hex digits is not what you want.

You are correct. This was something I added while manually testing. Thanks!

Per the update in my comment, might be easier to add Dockerfile so all the dependencies don't have to be installed locally to run the unit tests. I'm relying on circle CI right now.

I'd prefer to see the fmt.Sprintf done once, rather than each time, as well.

Added a local var since you can't assign the output of formatting to a constant.

For that matter, perhaps you could move max out of the function as well, so it's not reparsed each time. I suspect the PR would be more palatable if it had a performance improvement, rather than only making the constant explicit. (On the explicitness front, why not strings.Repeat("ff", crypto.DIgestSize)?)

tsachiherman · 2022-02-03T15:21:34Z

@Olshansk - can I suggest a small improvement for your PR ?
I've implemented these proposals in #3559.

tsachiherman · 2022-02-03T15:25:20Z

@Olshansk - you'll need to sign the CLAassistant before the proposed code changes could be merged in.

Olshansk · 2022-02-04T15:51:37Z

@tsachiherman Thanks for the suggestion!

TIL about init in Golang so this was super useful. I also updated the description of the PR with a Purpose to provide some extra context for the need for this change. It's very much a nice-to-have.

Also signed the agreement.

codecov-commenter · 2022-02-04T16:29:50Z

Codecov Report

Merging #3558 (1645da6) into master (6230dc5) will decrease coverage by 0.00%.
The diff coverage is 62.50%.

@@            Coverage Diff             @@
##           master    #3558      +/-   ##
==========================================
- Coverage   47.58%   47.58%   -0.01%     
==========================================
  Files         370      370              
  Lines       60118    60121       +3     
==========================================
  Hits        28606    28606              
- Misses      28198    28203       +5     
+ Partials     3314     3312       -2

Impacted Files	Coverage Δ
data/committee/sortition/sortition.go	`82.35% <62.50%> (-3.37%)`	⬇️
network/wsPeer.go	`65.83% <0.00%> (-3.34%)`	⬇️
catchup/peerSelector.go	`98.95% <0.00%> (-1.05%)`	⬇️
ledger/acctupdates.go	`66.53% <0.00%> (-0.20%)`	⬇️
catchup/service.go	`69.38% <0.00%> (+0.74%)`	⬆️
ledger/tracker.go	`75.75% <0.00%> (+1.29%)`	⬆️
crypto/merkletrie/node.go	`93.48% <0.00%> (+1.86%)`	⬆️
crypto/merkletrie/trie.go	`68.61% <0.00%> (+2.18%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6230dc5...1645da6. Read the comment docs.

jannotti

This looks good to me.

tsachiherman

Thanks for the changes.

Olshansk · 2022-02-07T02:11:41Z

@tsachiherman @jannotti I was wondering if either of you could help me understand the logic behind sortition_binomial_cdf_walk.

I've looked at the paper and read through some online articles (example) but am still struggling to understand this:

In short, I understand that we use VRF to compute a proof and a hash, which is normalized to a value in [0, 1) using the hashlen (i.e. the ratio). The binomial distribution is built using the total voting power, and the number of expected leaders. We then compute the CDF for each value j less than the specific node's voting power and compare it boundary of the CDF.

    // Found the correct boundary, break
    if (ratio <= boundary) {
      return j;
    }

However, I'm still struggling to understand the significance of j in the check above. Are we saying that a single node has the opportunity to vote with at most money tokens (i.e. their voting power), but based on the random value of the VRF hash, it is going to be some value less than that based on the CDF?

jannotti · 2022-02-07T15:37:33Z

It's not something I'm super familiar with, but I think you've got the idea. Suppose I have 1M algos. I don't "win" the VRF lottery will all of them, or lose with all of them. Some fraction of them win and some fraction loses, based on the outcome of the VRF.

Olshansk · 2022-02-08T03:50:44Z

Yea, the high level idea makes sense and even most of the intricacies except for that one… Maybe it's something we could figure out together?

I've looked at section 5 of the whitepaper and my understanding is:

Assume / let the following:

Each user as a Vk (Verification Key) and a Sk (Secret Key)
Vk of each user has been broadcasted to all other users in the network
Seed S is determined and publically known (but only computed AFTER Vk has been broadcasted)
Every user i has a voting power Wi where W is the voting power of the whole network

p, hash = VRFsk(seed)

p is the proof that can be verified with VRFvk(seed)
hash is uniformly distributed and is of size hashlen

The binomial distribution is set to:

Number of trials is the individual nodes voting power (each unit is an individual attempt to be selected to protect against Sybil attacks)
The probably of each trial is number of expected candidates in committee / total voting power (each unit is an individual sub-user)

After re-reading it, I actually think the core of the “magic” is here:

Hash / 2^hashlen normalizes the uniformly distributed VRF output
Using the binomial distribution, we can find out if EXACTLY j sub-users are selected.

The maximum value for j is the maximum number of their sub-users that may be selected. However, if exactly j sub-users are not selected (based on the normalized VRF output), it doesn’t mean that a value less than j might not be selected. E.g. if the total voting power is 1000 ALGO, and a single node has 100, they should still have an opportunity to be a potential committee member if only 50 of their ALGO is elected as their voting round for that view change.

My understanding is that this is the reason we’re using a CDF of the binomial distribution and accepting any value of j smaller than their individual voting power as the value to be a candidate.

## Summary Following up on #3558, this PR update the THANKS.md file. ## Test Plan Not required for this change.

jannotti reviewed Feb 3, 2022

View reviewed changes

Olshansk added 2 commits February 3, 2022 09:04

Remove the +1

d27eabd

Create a local variable rather than recompute it each time

9c41c54

tsachiherman self-requested a review February 3, 2022 15:29

tsachiherman added the quality Improve code quality or style label Feb 3, 2022

Olshansk added 2 commits February 4, 2022 10:45

Moved maxFloat generation into an init function

b31c833

Replace the *2 and with

1645da6

tsachiherman mentioned this pull request Feb 4, 2022

Tsachi/minor sortition fix2 #3559

Closed

tsachiherman requested a review from jannotti February 4, 2022 16:32

jannotti approved these changes Feb 4, 2022

View reviewed changes

tsachiherman approved these changes Feb 4, 2022

View reviewed changes

tsachiherman merged commit c765903 into algorand:master Feb 4, 2022

tsachiherman mentioned this pull request Feb 8, 2022

update THANKS.md file #3595

Merged

tsachiherman added a commit that referenced this pull request Feb 8, 2022

update thanks file (#3595)

9cc2498

## Summary Following up on #3558, this PR update the THANKS.md file. ## Test Plan Not required for this change.

onetechnical mentioned this pull request Feb 11, 2022

go-algorand 3.4.0-beta #3613

Merged

egieseke mentioned this pull request Feb 16, 2022

Script to generate go-algorand release notes #3520

Closed

onetechnical mentioned this pull request Mar 1, 2022

go-algorand 3.4.2-stable #3700

Merged

Olshansk mentioned this pull request Mar 14, 2022

Libraries for Cryptographic Sortition and VRF pokt-network/pocket#37

Merged

jessicadaugherty mentioned this pull request Feb 13, 2023

[Consensus] Leader Election - Productionize the sortition algorithm pokt-network/pocket#324

Open

16 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Removed hard-coded constant in `sortition.go`. #3558

Removed hard-coded constant in `sortition.go`. #3558

Olshansk commented Feb 3, 2022 •

edited

CLAassistant commented Feb 3, 2022 •

edited

jannotti Feb 3, 2022

Olshansk Feb 3, 2022

Olshansk Feb 3, 2022

jannotti Feb 3, 2022

Olshansk Feb 3, 2022

jannotti Feb 3, 2022

Olshansk Feb 4, 2022

tsachiherman commented Feb 3, 2022

tsachiherman commented Feb 3, 2022

Olshansk commented Feb 4, 2022

codecov-commenter commented Feb 4, 2022 •

edited

jannotti left a comment

tsachiherman left a comment

Olshansk commented Feb 7, 2022

jannotti commented Feb 7, 2022

Olshansk commented Feb 8, 2022 •

edited

Removed hard-coded constant in sortition.go. #3558

Removed hard-coded constant in sortition.go. #3558

Conversation

Olshansk commented Feb 3, 2022 • edited

Summary

Purpose

Test Plan

CLAassistant commented Feb 3, 2022 • edited

jannotti Feb 3, 2022

Choose a reason for hiding this comment

Olshansk Feb 3, 2022

Choose a reason for hiding this comment

Olshansk Feb 3, 2022

Choose a reason for hiding this comment

jannotti Feb 3, 2022

Choose a reason for hiding this comment

Olshansk Feb 3, 2022

Choose a reason for hiding this comment

jannotti Feb 3, 2022

Choose a reason for hiding this comment

Olshansk Feb 4, 2022

Choose a reason for hiding this comment

tsachiherman commented Feb 3, 2022

tsachiherman commented Feb 3, 2022

Olshansk commented Feb 4, 2022

codecov-commenter commented Feb 4, 2022 • edited

Codecov Report

jannotti left a comment

Choose a reason for hiding this comment

tsachiherman left a comment

Choose a reason for hiding this comment

Olshansk commented Feb 7, 2022

jannotti commented Feb 7, 2022

Olshansk commented Feb 8, 2022 • edited

Removed hard-coded constant in `sortition.go`. #3558

Removed hard-coded constant in `sortition.go`. #3558

Olshansk commented Feb 3, 2022 •

edited

CLAassistant commented Feb 3, 2022 •

edited

codecov-commenter commented Feb 4, 2022 •

edited

Olshansk commented Feb 8, 2022 •

edited