{{ message }}
/ consensus-specs Public

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

# Possible alternative numer-theoretic shuffling algorithm#323

Closed
opened this issue Dec 14, 2018 · 13 comments
Closed

# Possible alternative numer-theoretic shuffling algorithm #323

opened this issue Dec 14, 2018 · 13 comments
Labels
general:RFC Request for Comments

### Motivation

Construct a shuffling algorithm where you can compute the value in the shuffled list at any specific position relatively cheaply without computing all of the other values at the same time. This could be used to replace the current `shuffle`.

### Helpers

```def is_prime(x):
return [i for i in range(2, int(x**0.5)+1) if x%i == 0] == []```

### Algorithm

```def value_at_position(n, value, seed):
# We do the shuffling mod p, the lowest prime >= n, but if we actually shuffle into
# the "forbidden" [n...p-1] slice we just reshuffle until we get out of that slice
p = n
while not is_prime(p):
p += 1
# x -> x**power is a permutation mod p
power = 3
while (p-1) % power == 0 or not is_prime(power):
power += 2
for round in range(40):
a = int.from_bytes(seed[(round % 8)*4: (round % 8)*4 + 4], 'big')
value = (pow(value, power, p) + a) % p
while value >= n:
value = (pow(value, power, p) + a) % p
# Update the seed if needed
if round % 8 == 0:
seed = hash(seed)
return value

def shuffle(values, seed):
return [values[value_at_position(len(values), i, seed)] for i in range(len(values))]```

Note that the above is a maximally simple definition, not an optimal implementation. An optimal implementation would calculate the prime, the exponent, and the `a` values once, and pass them as inputs to `value_at_position`, which would then need to do much less work per value.

The main weaknesses are: (i) the computation of the total shuffle is slower even with optimizations, and (ii) the security argument is more heuristic ("it repeats `x -> x**3 + k[i]` so it's like MIMC") than the current Fisher-Yates shuffle. However, the strengths may be worth it, especially if we wish to cease storing the shuffling in the state. changed the title Possible alternative shuffling algorithm Possible alternative numer-theoretic shuffling algorithm Dec 14, 2018

### mratsim commented Dec 14, 2018 • edited

What are the security properties that are needed?

### Quick analysis of the bottlenecks

The main bottleneck in the current `Fisher-Yates` shuffling is the repeated `modulo` in the loop as it's an expensive operation. In the new proposed shuffling there are `gcd`, `pow` and `modulo`.

In my opinion the main advantage of Fisher-Yates is the ability to do it in-place, however I suppose to be able to rollback state, most clients will not implement it in-place but in a temporary location instead.

### Alternative

In case we need a temporary location, I think reservoir sampling will be superior in terms of performance while preserving equi-probabilities of all sequences.

The algorithm is very simple and doesn't involve expensive operation, from Wikipedia (array indexing starts at one)

```ReservoirSample(S[1..n], R[1..k])
// fill the reservoir array
for i = 1 to k
R[i] := S[i]

// replace elements with gradually decreasing probability
for i = k+1 to n
j := random(1, i)   // important: inclusive range
if j <= k
R[j] := S[i]```

We only need to couple it with a prng. For example XorShift128+ is used by all major browsers (and could be a eWASM primitive):

```#include <stdint.h>

/* The state must be seeded so that it is not all zero */
uint64_t s;

uint64_t xorshift128plus(void) {
uint64_t x = s;
uint64_t const y = s;
s = y;
x ^= x << 23; // a
s = x ^ y ^ (x >> 17) ^ (y >> 26); // b, c
return s + y;
}```

Or the latest Xoroshiro128+ by the same author, which is significantly faster and with much better statistical quality.

Implementation in Nim (disclaimer this is the default Nim PRNG):

```type
Rand* = object ## State of the random number generator.
## The procs that use the default state
## are **not** thread-safe!
a0, a1: ui

var state = Rand(  # <---- We can use our own seed instead
a0: 0x69B4C98CB8530805u64,
a1: 0xFED1DD3004688D67CAu64)

proc rotl(x, k: ui): ui =
result = (x shl k) or (x shr (ui(64) - k))

proc next*(r: var Rand): uint64 =
## Uses the state to compute a new ``uint64`` random number.
let s0 = r.a0
var s1 = r.a1
result = s0 + s1
s1 = s1 xor s0
r.a0 = rotl(s0, 55) xor s1 xor (s1 shl 14) # a, b
r.a1 = rotl(s1, 36) # c```

This is much more hardware friendly.

### Drawback

Like the current Fisher-Yates, you need to keep a state.

### mratsim commented Dec 15, 2018

 I've been thinking over this and unfortunately reservoir sampling wouldn't work as is. We need a reservoir that hold the final selection, so it must be the size of the list to shuffle, i.e. we would start with the list and then we remove (~sample) elements randomly from it, so basically it just degrade into random sampling without replacement. However when doing random sampling the data structure used is quite important and a list/vector is not the most efficient one (in my benchmarks a Fenwick tree is very efficient) but if we need to reorganize into a new data structure we might as well do copy+Fisher Yates. An alternative scheme would be to have a smaller reservoir (say 8 items) and add items ejected from the reservoir to the final list. Unfortunately even if we start from a random point in the original list, the order would be correlated to the original order. So in short I think reservoir sampling is not suitable.

### vbuterin commented Dec 15, 2018

 The main desired property is being maximally close to a random permutation. Specifically, it should be maximally computationally difficult to find seeds that ensure that in a shuffling of `1...N`, any specific subset `x1, x2, x3.... x[k]` are in a specific range `[v ..... v+k]`. The reason why the number-theoretic shuffle above is interesting is that you don't need to compute the whole output to compute a small portion of the output. This makes it much easier to use in light clients, and to verify blocks generally.

### vbuterin commented Dec 15, 2018

 I just optimized the code a bit; it can shuffle 100000 values in one second (vs 1 million in one second for the Fisher-Yates shuffle). That said, the hash complexity is smaller, so I suspect in an optimized C implementation the number-theoretic shuffle will not be much smaller than out current Fisher-Yates shuffle, as in python the hashes are optimized but the rest of the code is not, leading to hashing costs being understated.

### vbuterin commented Dec 18, 2018 • edited by djrtwo

 Here is a hash-based alternative. Here's the core: ```def numhash(x, i, seed, modulus): assert 0 <= i < 4 return (int.from_bytes(hash(x.to_bytes(32, 'big') + seed), 'big') // modulus**i) % modulus def feistel(x, modulus, seed): assert is_perfect_square(modulus) and modulus < 2**50 h = int(modulus ** 0.5) L, R = x//h, x%h for i in range(4): new_R = (L + numhash(R, i, seed, h)) % h L = R R = new_R return L * h + R``` Now for the actual function, let `modulus` be the smallest perfect square greater than or equal to `n`. To permute some `value`, set `value = feistel(value, modulus, seed)` and if needed repeat until the result is less than `n`, just as above. This takes 4 hashes to compute for a single number, but to permute an entire list you only need `~sqrt(n)` hashes.

### vbuterin commented Dec 18, 2018 • edited

 I implemented both options and the status quo here: https://github.com/ethereum/research/tree/master/shuffling Here's the timing test output, testing both computing a full shuffle of 100000 and computing just a specific sub-committee of 500 out of the 100000: ```Testing prime shuffle [40388, 24854, 44555, 69180, 37292, 27818, 85124, 51675, 75163, 16592] [40388, 24854, 44555, 69180, 37292, 27818, 85124, 51675, 75163, 16592] Total runtime: 1.0141525268554688 Runtime to compute committee: 0.022953271865844727 Testing feistel shuffle [82855, 3100, 89704, 87662, 7830, 16014, 57626, 95313, 53632, 97853] [82855, 3100, 89704, 87662, 7830, 16014, 57626, 95313, 53632, 97853] Total runtime: 0.10488557815551758 Runtime to compute committee: 0.00408935546875 Testing Fisher-Yates shuffle (status quo) [93044, 39644, 88989, 78137, 17662, 17187, 41433, 85069, 64061, 6647] Total runtime: 0.08116364479064941```

### mratsim commented Dec 19, 2018 • edited

 Feistel shuffling is impressively fast. I've transformed the current state-of-the-art sampling technique from natural language processing into a shuffling algorithm in ethereum/research#98. Originally it was for weighted sampling hence why it is rearranging input to efficiently check cumulative probability distributions. There is an initialisation cost but further sampling are 6.5x cheaper. ```Testing prime shuffle [40388, 24854, 44555, 69180, 37292, 27818, 85124, 51675, 75163, 16592] [40388, 24854, 44555, 69180, 37292, 27818, 85124, 51675, 75163, 16592] Total runtime: 1.651440143585205 Runtime to compute committee: 0.02932882308959961 Testing feistel shuffle [82855, 3100, 89704, 87662, 7830, 16014, 57626, 95313, 53632, 97853] [82855, 3100, 89704, 87662, 7830, 16014, 57626, 95313, 53632, 97853] Total runtime: 0.15117597579956055 Runtime to compute committee: 0.005218982696533203 Testing Fisher-Yates shuffle [93044, 39644, 88989, 78137, 17662, 17187, 41433, 85069, 64061, 6647] Total runtime: 0.1256880760192871 Testing F+tree sampling [93044, 39644, 88989, 78137, 17662, 17187, 41433, 85069, 64061, 6647] [50405, 51879, 54080, 30514, 76290, 60097, 83697, 19454, 68194, 85273] [75628, 50172, 45023, 21349, 80036, 81698, 2829, 994, 88511, 20356] Total runtime: 1.5932817459106445 Runtime to compute first committee: 0.053829193115234375 Runtime to compute next committee: 0.008832931518554688```

### JustinDrake commented Dec 21, 2018

 Benedikt Bunz points to this paper as a potential candidate.

### drozdziak1 commented Jan 17, 2019

 Is it certain that the present shuffling approach is going to be ditched?

### JustinDrake commented Jan 17, 2019

 @drozdziak1 I'd say 80% :) Light-client friendly shuffles seem like a big win, and they are possible. It's now an engineering/cryptography question to find a really efficient one, maybe even a SNARK/STARK-friendly one.

This was referenced Jan 28, 2019

### esaulpaugh commented Jan 31, 2019 • edited

 It sounds like a CSPRNG is a requirement. Anything less just seems sketchy when deliberate attacks are expected. As they say, attacks only get better

### th4s commented Feb 1, 2019

 @vbuterin have you run some random number tests (e.g. dieharder) against the primenumber implementation or is this algorithm some well-known CSPRNG?

### djrtwo commented Feb 12, 2019

 closed by adding "swap or not" #563

to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
general:RFC Request for Comments
Projects
None yet
Development

No branches or pull requests