# PV181 Seminar 01 - RNG (python)

This notebook contains Python code for several tasks covered in this seminar.

# Task 0: ANSI C example code
The following code is taken from the [ANSI C standard](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf#page=324) and simplified to match another portable implementation (see the glibc implementation of [rand()](https://code.woboq.org/userspace/glibc/stdlib/random_r.c.html#__random_r)) of the seeding function `srand()` and the generation function `rand()`.

```
static unsigned long int next = 1;

void srand(unsigned int seed)
{
    next = seed;
}

int rand(void) // RAND_MAX assumed to be 2147483647 (0x7fffffff)
{
    return next = (next * 1103515245 + 12345) & 0x7fffffff;
}
```

# Task 1: PRNG in general
The following class defines a generic PRNG. Use `class PRNG` to create a generator (object `gcc_old`) equivalent to the C code from **Task 0**. Just set the functions (`Init`, `Trans`, `Out`) appropriately (only one function needs to be defined). Use the methods `srand()` and `rand()` to seed the generator with `0` and generate (print out) 10 random values.

In [1]:
def Id(x):  # identity function; equivalent to lambda x: x
    return x

class PRNG:
    def __init__(self, Init = Id, Trans = Id, Out = Id ):
        self.Init = Init
        self.Trans = Trans
        self.Out = Out
        self.state = self.Init(1)

    def srand(self, seed):
        self.state = self.Init(seed)

    def rand(self):
        self.state = self.Trans(self.state)
        return self.Out(self.state)

gcc_old = PRNG()
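
One possible way to complete the task is sketched below. It assumes that only `Trans` needs to differ from the identity, since the C code stores the seed unchanged and returns the new state directly. The name `lcg_step` is chosen here for illustration, and the class is repeated in compact form only so the sketch runs on its own.

```python
# Compact copy of the PRNG class from the cell above.
class PRNG:
    def __init__(self, Init=lambda x: x, Trans=lambda x: x, Out=lambda x: x):
        self.Init, self.Trans, self.Out = Init, Trans, Out
        self.state = self.Init(1)

    def srand(self, seed):
        self.state = self.Init(seed)

    def rand(self):
        self.state = self.Trans(self.state)
        return self.Out(self.state)


def lcg_step(x):
    # State transition copied from the C code in Task 0 (illustrative name).
    return (x * 1103515245 + 12345) & 0x7fffffff


gcc_old = PRNG(Trans=lcg_step)
gcc_old.srand(0)
first_values = [gcc_old.rand() for _ in range(10)]
print(first_values)
```

Because `Out` is the identity, every printed value is also the full internal state of the generator.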

# Questions 1, 2:  problematic `gcc_old` 
An attacker observed a value `x` generated by the `gcc_old` generator:
1. He is able to predict the **next** values. Why?
2. He is also able to reconstruct the **previous** values. Why?
 


# Task 2: Inverse LCG
Implement an "inverse" generator `gcc_old_inv = PRNG()` that generates the same values as `gcc_old` but in the opposite direction. You need to invert the `Init`, `Trans` and `Out` functions of `gcc_old`.

**Hint**: It might be useful to use `pow(a, -1, m)`, which computes the inverse of `a` modulo `m` (i.e. $a^{-1} \pmod m$). Note that a negative exponent combined with a modulus is only allowed from Python 3.8 onward.
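
The hint can be illustrated with a minimal sketch. It assumes the modulus is $2^{31}$ (from the mask `0x7fffffff` in Task 0) and relies on the multiplier being odd, so that its inverse modulo $2^{31}$ exists. The names `step` and `step_back` are hypothetical.

```python
M = 2**31            # modulus implied by the mask 0x7fffffff
A = 1103515245       # multiplier from Task 0 (odd, hence invertible mod 2**31)
C = 12345            # increment from Task 0
A_INV = pow(A, -1, M)  # modular inverse; requires Python 3.8 or newer


def step(x):
    # Forward LCG transition.
    return (A * x + C) % M


def step_back(x):
    # Undo one forward transition: solve y = A*x + C (mod M) for x.
    return (A_INV * (x - C)) % M


x = 123456789
print(step_back(step(x)) == x)
```

Python's `%` always returns a non-negative result for a positive modulus, so `x - C` needs no special handling when it is negative.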

# Questions 3, 4: combination of weak sources
The following code collects entropy from time stamps (in nanoseconds).

1. Why is the **xor** operation (usually denoted by $\oplus$) of "random" values not appropriate in this case?
2. What (cryptographic?) function would be more appropriate and why?

In [3]:
import time
times = [time.time_ns() for _ in range(100)]
res = times[0].to_bytes(8, byteorder='big')
for t in times[1:]:
    res = bytes(a ^ b for (a, b) in zip(res, t.to_bytes(8, byteorder='big')))
print(res.hex())

0000000000012a48
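
The leading zeros in the output above can be reproduced with made-up data. The sketch below uses hypothetical, hand-picked time stamps (not real clock readings) that share their upper bytes, XORs them, and, as one commonly used alternative, feeds the same data into SHA-256 from `hashlib`. Both `BASE` and the offset formula are assumptions made for illustration.

```python
import hashlib
from functools import reduce

# Hypothetical time stamps: a common high part plus small, growing offsets.
BASE = 0x1790_0000_0000_0000
stamps = [BASE + 1000 * k + (k * 37) % 1000 for k in range(100)]

# XOR of an even number of values cancels every bit they all share.
xored = reduce(lambda a, b: a ^ b, stamps)
print(xored.to_bytes(8, "big").hex())

# Hashing the concatenation keeps a dependence on every input byte.
packed = b"".join(s.to_bytes(8, "big") for s in stamps)
digest = hashlib.sha256(packed).digest()
print(digest.hex())
```

Only the low-order bits, where the stamps actually differ, survive the XOR; the hash output has a fixed 32-byte length regardless of how little the inputs vary.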


# Tasks: Testing (voluntary)
These tasks can be viewed as BONUS tasks and should therefore be done at the end of the seminar (after the tasks in C) or at home. They provide some insight into the required properties of PRNG output. You will also learn the basics of hypothesis testing, which can be used in other areas as well (not only in cryptography).

# Task 3: Basic testing (Frequency test)
Use the following randomness test (called the Frequency or Monobit test), which analyzes whether the frequencies of the bits (0 and 1) are roughly equal. Use the `monobit` test to analyze the LCG generator and the standard [`random()`](https://docs.python.org/3/library/random.html), which is based on the Mersenne Twister.

In [5]:
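import math

def monobit(data):
    # Count the one-bits in the byte sequence.
    num_ones = 0
    for byte in data:
        for i in range(8):
            num_ones += (byte >> i) & 1
    n = len(data)*8  # total number of bits
    # Normalized distance of the ones count from n/2, then the two-sided p-value.
    Sobs = abs(n - 2*num_ones) / math.sqrt(n)
    p_val = math.erfc(Sobs / math.sqrt(2))
    return p_val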

pass
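
A possible way to drive the test is sketched below. It restates `monobit` compactly so the block runs on its own, and it assumes each LCG output is serialized as 4 big-endian bytes; `lcg_bytes` is a hypothetical helper, and the seeds are arbitrary.

```python
import math
import random


def monobit(data):
    # Same statistic as the monobit cell above, written compactly.
    num_ones = sum(bin(byte).count("1") for byte in data)
    n = len(data) * 8
    s_obs = abs(n - 2 * num_ones) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def lcg_bytes(seed, count):
    # Hypothetical helper: 4 big-endian bytes per output of the Task 0 LCG.
    x = seed
    out = bytearray()
    for _ in range(count):
        x = (x * 1103515245 + 12345) & 0x7fffffff
        out += x.to_bytes(4, "big")
    return bytes(out)


random.seed(1)  # arbitrary seed, fixed for repeatability
mt_bytes = bytes(random.getrandbits(8) for _ in range(4000))

p_zero = monobit(bytes(1000))          # all bits zero: extreme imbalance
p_balanced = monobit(b"\x55" * 1000)   # exactly half ones
p_mt = monobit(mt_bytes)               # Mersenne Twister output
p_lcg = monobit(lcg_bytes(0, 1000))    # LCG output, seed 0
print(p_zero, p_balanced, p_mt, p_lcg)
```

The two fixed inputs mark the extremes of the scale: a perfectly balanced input gives a $p$-value of 1, while an all-zero input gives a value indistinguishable from 0.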

# Question 5: Test interpretation
How should the resulting $p$-value of a test (Monobit or another one) be interpreted, e.g., 0.0001 or 0.4? The interpretation is based on the following fact: for a **good** RNG the $p$-value is uniformly distributed on the interval [0,1], i.e. the probability of obtaining a $p$-value $\leq 0.01$ from a good RNG is 1%.

 1. How should a $p$-value of $10^{-10}$ be interpreted?
 2. Which is more likely for a $p$-value of $10^{-10}$: that it is the result of a good RNG or of a bad one?
 
 See [wiki](https://en.wikipedia.org/wiki/P-value)  or [NIST STS documentation](https://nvlpubs.nist.gov/nistpubs/legacy/sp/nistspecialpublication800-22r1a.pdf).
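
The uniformity fact can be checked empirically with a small simulation. The sketch below assumes Python's Mersenne Twister counts as a good RNG for this purpose. The helper `monobit_p`, the seed, the sequence length and the number of trials are all arbitrary choices made here.

```python
import math
import random


def monobit_p(num_ones, n):
    # Hypothetical helper: Monobit p-value from the number of ones among n bits.
    s_obs = abs(n - 2 * num_ones) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


rng = random.Random(2024)    # fixed seed so the experiment is repeatable
n_bits, trials = 1024, 1000

p_values = []
for _ in range(trials):
    sample = rng.getrandbits(n_bits)          # one n_bits-long random sequence
    p_values.append(monobit_p(bin(sample).count("1"), n_bits))

frac_small = sum(p <= 0.01 for p in p_values) / trials
frac_half = sum(p <= 0.5 for p in p_values) / trials
print(f"fraction of p-values <= 0.01: {frac_small:.3f}")
print(f"fraction of p-values <= 0.5:  {frac_half:.3f}")
```

If the $p$-values are close to uniform, the two fractions should land near 0.01 and 0.5. They will not match exactly, because the number of trials is finite and the ones count is discrete.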
  

# Task 4: Generalized Monobit test
Using the implementation of the `monobit` function, implement `monobit_i(values, i)`, which focuses on the `i`-th bit of the generated values. It counts the frequency of the `i`-th bit across the values in the `values` list. Find which of the bits (of the 4 bytes) generated by `gcc_old` are biased.

In [15]:
def monobit_i(values, i):
    pass
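
One way `monobit_i` could look is sketched below. It assumes bit `0` is the least significant bit and that `values` is a list of integers; `lcg_values` is a hypothetical helper that replays the Task 0 recurrence.

```python
import math


def lcg_values(seed, count):
    # Hypothetical helper: first `count` outputs of the Task 0 LCG.
    x = seed
    out = []
    for _ in range(count):
        x = (x * 1103515245 + 12345) & 0x7fffffff
        out.append(x)
    return out


def monobit_i(values, i):
    # Monobit statistic restricted to bit i (0 = least significant) of each value.
    n = len(values)
    num_ones = sum((v >> i) & 1 for v in values)
    s_obs = abs(n - 2 * num_ones) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


values = lcg_values(0, 1000)
for i in range(32):
    print(i, monobit_i(values, i))
```

A bit that is perfectly balanced in count gives a $p$-value of 1. That says nothing about the order in which its values appear, so a frequency test alone cannot detect a strictly alternating bit.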


# Task 5: Test for correlation of bits
Implement a function that computes the 4 frequencies of the combinations of bits at the $i$-th and $j$-th positions of the generated values. Then use the $\chi^2$ test to check whether all four frequencies are equal, as they should be for a good RNG. You can use the `scipy` Python module, which already implements the [$\chi^2$ test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.chisquare.html).

 1. Check whether some bits ($i$-th, $j$-th) are correlated within the same generated value (i.e. $0\leq i,j < 32$).
 2. Check whether some bits ($i$-th, $j$-th) are correlated across two consecutive values, i.e. for a list of generated random values $rnd_0, rnd_1, rnd_2, rnd_3, \cdots$ check whether the $i$-th bit of $rnd_0$ is correlated with the $j$-th bit of $rnd_1$, the $i$-th bit of $rnd_2$ with the $j$-th bit of $rnd_3$, etc.







In [8]:
def histogram(values, i, j):
    pass

def correlation_test(values):
    pass
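
The histogram step and the $\chi^2$ step can be sketched with the standard library only. The $p$-value below uses the closed-form survival function of the $\chi^2$ distribution with 3 degrees of freedom instead of calling `scipy.stats.chisquare`, which would normally perform this step. The names `lcg_values`, `histogram_consecutive` and `chi2_uniform_pvalue` are hypothetical, and the cell index `2*bit_i + bit_j` is an arbitrary convention.

```python
import math


def lcg_values(seed, count):
    # Hypothetical helper: first `count` outputs of the Task 0 LCG.
    x = seed
    out = []
    for _ in range(count):
        x = (x * 1103515245 + 12345) & 0x7fffffff
        out.append(x)
    return out


def histogram(values, i, j):
    # Counts of (bit i, bit j) within the same value; cell index = 2*bit_i + bit_j.
    counts = [0, 0, 0, 0]
    for v in values:
        counts[2 * ((v >> i) & 1) + ((v >> j) & 1)] += 1
    return counts


def histogram_consecutive(values, i, j):
    # Bit i of rnd_0, rnd_2, ... paired with bit j of rnd_1, rnd_3, ...
    counts = [0, 0, 0, 0]
    for a, b in zip(values[0::2], values[1::2]):
        counts[2 * ((a >> i) & 1) + ((b >> j) & 1)] += 1
    return counts


def chi2_uniform_pvalue(counts):
    # Chi-square statistic against equal expected counts in 4 cells (3 d.o.f.).
    expected = sum(counts) / 4
    stat = sum((c - expected) ** 2 / expected for c in counts)
    # Survival function of chi-square with 3 degrees of freedom (closed form).
    p = math.erfc(math.sqrt(stat / 2)) + math.sqrt(2 * stat / math.pi) * math.exp(-stat / 2)
    return min(1.0, p)


values = lcg_values(0, 1000)
same = histogram(values, 0, 1)
consecutive = histogram_consecutive(values, 0, 0)
print(same, chi2_uniform_pvalue(same))
print(consecutive, chi2_uniform_pvalue(consecutive))
```

A full `correlation_test` would loop this over all pairs $(i, j)$. With that many tests, some small $p$-values are expected by chance alone, so individual results should be read with care.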
