# Statistics

In [1]:
import pickle as pkl

In [2]:
with open("stats.pkl", "rb") as f:
    stats = pkl.load(f)

## Bit count

A first idea is to count the bits: a pseudo-random number generator should output roughly as many "0"s as "1"s.

In [3]:
bit_stats = ["{:08b}".format(el) for el in stats]
big_string = "".join(bit_stats)
number_of_0 = big_string.count("0")
print(number_of_0/len(big_string))

0.4999715625


The proportion of zeros is very close to 50% over 10,000,000 generated numbers (80,000,000 bits), which is a good sign.
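"Very close to 50%" can be quantified. The sketch below follows the frequency (monobit) test described in NIST SP 800-22: it maps bits to ±1, sums them, and converts the normalized sum into a p-value. The function name `monobit_p_value` is illustrative and not part of the original notebook, and the input is assumed to be a string containing only "0" and "1".

```python
import math


def monobit_p_value(bits: str) -> float:
    """p-value of the frequency (monobit) test for a string of '0'/'1'."""
    n = len(bits)
    ones = bits.count("1")
    zeros = bits.count("0")
    if n == 0 or ones + zeros != n:
        raise ValueError("expected a non-empty string of '0' and '1'")
    # Mapping '0' -> -1 and '1' -> +1, the sum of the sequence is ones - zeros.
    # Under randomness this sum is approximately normal with variance n.
    s_obs = abs(ones - zeros) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))
```

In the notebook this would be called as `monobit_p_value(big_string)`. With the figures above (39,997,725 zeros out of 80,000,000 bits) the excess of ones is 4,550, about 0.51 standard deviations, which corresponds to a p-value of roughly 0.6, well above the 0.01 level used in the NIST document.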

## Fourier transform

In [9]:
import numpy
import scipy.fft
import scipy.special
def spectral(bin_data: str):
    """
    Note that this description is taken from the NIST documentation [1]
    [1] http://csrc.nist.gov/publications/nistpubs/800-22-rev1a/SP800-22rev1a.pdf
    The focus of this test is the peak heights in the Discrete Fourier Transform of the sequence. The purpose of
    this test is to detect periodic features (i.e., repetitive patterns that are near each other) in the tested
    sequence that would indicate a deviation from the assumption of randomness. The intention is to detect whether
    the number of peaks exceeding the 95 % threshold is significantly different than 5 %.
    :param bin_data: a binary string
    :return: the p-value from the test
    """
    n = len(bin_data)
    plus_minus_one = []
    for char in bin_data:
        if char == '0':
            plus_minus_one.append(-1)
        elif char == '1':
            plus_minus_one.append(1)
    # Compute the discrete Fourier transform of the +/-1 sequence
    s = scipy.fft.fft(plus_minus_one)
    half_len = n // 2
    modulus = numpy.abs(s[0:half_len])
    tau = numpy.sqrt(numpy.log(1 / 0.05) * n)
    # Expected number of peaks below the threshold under randomness (95 %)
    count_n0 = 0.95 * (n / 2)
    # Observed number of peaks below the threshold (modulus < tau)
    count_n1 = len(numpy.where(modulus < tau)[0])
    # Normalized difference d between observed and expected counts, then the p-value
    d = (count_n1 - count_n0) / numpy.sqrt(n * 0.95 * 0.05 / 4)
    p_val = scipy.special.erfc(abs(d) / numpy.sqrt(2))
    return p_val

In [10]:
spectral(big_string)

0.8736478816189553
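The NIST document applies its decision rule at the 1% level, so a p-value of about 0.87 gives no evidence of periodic structure. To see what a failure looks like, the sketch below recomputes the same statistic in a vectorized way and can be run on a strictly periodic sequence. It is a variant for illustration, not the notebook's function: `spectral_p_value` and its `alpha` parameter are assumed names, `numpy.fft` replaces `scipy.fft`, and the input is assumed to contain only "0" and "1".

```python
import math

import numpy as np


def spectral_p_value(bits: str, alpha: float = 0.05) -> float:
    """Vectorized sketch of the spectral (DFT) test p-value."""
    if not bits or set(bits) - {"0", "1"}:
        raise ValueError("expected a non-empty string of '0' and '1'")
    # Convert the ASCII characters to 0/1, then map to -1/+1 as floats.
    zero_one = np.frombuffer(bits.encode("ascii"), dtype=np.uint8) - ord("0")
    x = 2.0 * zero_one - 1.0
    n = x.size
    # Peak heights over the first half of the spectrum.
    modulus = np.abs(np.fft.fft(x)[: n // 2])
    threshold = math.sqrt(math.log(1 / alpha) * n)
    # Expected vs. observed number of peaks below the threshold.
    expected_below = (1 - alpha) * n / 2
    observed_below = int(np.count_nonzero(modulus < threshold))
    d = (observed_below - expected_below) / math.sqrt(n * (1 - alpha) * alpha / 4)
    return math.erfc(abs(d) / math.sqrt(2))
```

For the alternating sequence `"01" * 512`, all the spectral energy sits at the single frequency `n / 2`, which lies outside the first half of the spectrum, so every examined peak falls below the threshold instead of the expected 95% and the p-value collapses towards zero.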