Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature Request] Random BigInt #101

Open
dlesnoff opened this issue Feb 3, 2022 · 3 comments
Open

[Feature Request] Random BigInt #101

dlesnoff opened this issue Feb 3, 2022 · 3 comments

Comments

@dlesnoff
Copy link
Contributor

dlesnoff commented Feb 3, 2022

I would like a function to generate a uniformly (as much as possible) random BigInt inferior to some upper bound.
This would enable to randomize tests.

@pietroppeter
Copy link
Contributor

FYI I learned relatively recently that this randomization of tests is called fuzzing in the InfoSec domain

@dlesnoff
Copy link
Contributor Author

dlesnoff commented Feb 3, 2022

I am not sure we can call this fuzzing, that is why I purposely avoided the term.
Even though my inputs to my tests are random, I just want to check that mathematical properties are verified by random valid integers, to give an heuristic assurance of the correctness of the algorithms, not that the library catches exceptions for purposely invalid, unexpected inputs.
From what I understand based on the Wikipedia page, if the tests I want to write can be called a fuzzer, it would be white-box, generational and dumb (the weakest kind of fuzzer). It is always a good start.

One use case would be eg. for the gcd. We have that for some a, b, c integers :

gcd(gcd(a, b), c) = gcd(a, gcd(b, c))

I guess this does not prove necessarily that the algorithm computes the gcd, but at least that the function coded verifies the functional equation :

f(f(a,b), c) = f(a, f(b, c))

which is (intuitively) rare to happen for a pseudo-random function and for a sufficiently large number of queries with uniformly random inputs.
But the problem is that my algorithm is probably not a pseudo-random function and could fail only for edge-cases. A (good) fuzzer tries to modify the first generated inputs to find these cases. They are often meant to explore all the control flow graph (or a good selection of it with selective pruning).

I have not learnt how to do this. I only write tests so far manually. Automated tests for random numbers would enable users to report such edge cases if they encounter them by chance.

@mratsim
Copy link

mratsim commented Feb 11, 2022

A (good) fuzzer tries to modify the first generated inputs to find these cases. They are often meant to explore all the control flow graph (or a good selection of it with selective pruning).

That's a mutation-based fuzzer (which the more specialized kind of fuzzers).

Here are RNG strategies I use to find bugs in Constantine:

  • A fast prng with 512-bit state / cycle: https://github.com/mratsim/constantine/blob/53c4db7/helpers/prng_unsafe.nim#L40-L88
  • 3 RNG strategies:
    • uniform generation
      func random_unsafe(rng: var RngState, a: var BigInt) =
        ## Initialize a standalone BigInt
        for i in 0 ..< a.limbs.len:
          a.limbs[i] = SecretWord(rng.next())
    • High Hamming weight generation (83% of bits set to 1)
      func random_word_highHammingWeight(rng: var RngState): BaseType =
        let numZeros = rng.random_unsafe(WordBitWidth div 3) # Average Hamming Weight is 1-0.33/2 = 0.83
        result = high(BaseType)
        for _ in 0 ..< numZeros:
          result = result.clearBit rng.random_unsafe(WordBitWidth)
    • Long sequence of 0s and 1s to trigger bugs at words transition (like carries)
      func random_long01Seq(rng: var RngState, a: var openArray[byte]) =
        ## Initialize a bytearray
        ## It is skewed towards producing strings of 1111... and 0000
        ## to trigger edge cases
        # See libsecp256k1: https://github.com/bitcoin-core/secp256k1/blob/dbd41db1/src/testrand_impl.h#L90-L104
        let Bits = a.len * 8
        var bit = 0
        zeroMem(a[0].addr, a.len)
        while bit < Bits :
          var now = 1 + (rng.random_unsafe(1 shl 6) * rng.random_unsafe(1 shl 5) + 16) div 31
          let val = rng.sample_unsafe([0, 1])
          while now > 0 and bit < Bits:
            a[bit shr 3] = a[bit shr 3] or byte(val shl (bit and 7))
            dec now
            inc bit
    
    

The last one in particular has been instrumental in finding CVEs in OpenSSL that mutation-based fuzzers have a hard time to find (because cryptographic code has no branches and mutation-based fuzzer use branches to guide their coverage)

See:

Be prepared for a couple days of fixing bugs after adding those: mratsim/constantine#53 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants