# PROBABILITY

### Jack's Birthday Hash 

The derivation below follows the birthday paradox article at https://brilliant.org/wiki/birthday-paradox/

$p(n)$: probability that at least one of $n$ randomly selected strings hashes to one fixed target value in $\{0,1\}^{11}$ (the hash of Jack's secret)

$2^{11} = 2048$: number of possible hash values

Probability that none of the $n$ strings hits the target: $(1 - \frac{1}{2048})^n$

$p(n) = 1 - (1 - \frac{1}{2048})^n$

We will use the Taylor series approximation $\ln(1-x) \approx -x$ for small $x$.

We are looking for the $n$ at which $p(n) \approx 0.5$:

$ 0.5 \approx 1 - {(1 - \frac{1}{2048})}^n $

$ - 0.5 \approx  - {(1 - \frac{1}{2048})}^n $

$  0.5 \approx   {(1 - \frac{1}{2048})}^n $

Taking the logarithm of both sides:

$n \ln(1 - \frac{1}{2048}) \approx \ln(0.5)$

Applying the Taylor approximation with $x = \frac{1}{2048}$:

$-\frac{n}{2048} \approx \ln(0.5)$

$ n \approx -2048 \ln(0.5) = 2048 \ln(2) \approx 1420 $


In [6]:
import math

U = 2**11  # number of possible 11-bit hash values

# n ≈ U * ln(2), rounded to the nearest integer
round(U*math.log(2))

1420
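As a sanity check on the Taylor approximation, the same threshold can be solved without it. This is a minimal sketch: `p_match` and `n_exact` are names introduced here for illustration, and it assumes every hash value is equally likely.

```python
import math

U = 2**11  # number of possible 11-bit hash values


def p_match(n, space=U):
    """Probability that at least one of n hashes equals one fixed target hash."""
    return 1 - (1 - 1 / space) ** n


# Solve 1 - (1 - 1/U)^n >= 0.5 for n directly, without ln(1-x) ≈ -x.
n_exact = math.ceil(math.log(0.5) / math.log(1 - 1 / U))

print(n_exact)
print(p_match(n_exact - 1), p_match(n_exact))
```

The exact threshold and the rounded approximation agree here because $\frac{1}{2048}$ is small enough for the first-order Taylor term to dominate.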

### Jack's Birthday Confusion

Let $p(n)$ be the probability that in a set of $n$ randomly chosen people at least two share the same birthday. Then $1-p(n)$ is the probability that every single one of them has distinct birthdays. 

The number of ways to pick $n$ distinct birthdays from a set of 365 days is $365 \times 364 \times \dots \times (366 - n)$, because each successive birthday has one fewer day left to choose from. This is the numerator.

The number of possible birthday assignments for $n$ people is $365^n$. This is the denominator.

$1 - p(n) = \frac{365 \times 364 \times \dots \times (366 - n)}{365^n} = \frac{365!}{(365-n)! \, 365^n} \implies p(n) = 1 - \frac{365!}{(365-n)! \, 365^n}$
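The formula can be evaluated numerically by multiplying the factors one at a time, which avoids the huge factorials. The sketch below is only an illustration: `p_shared` is a name introduced here, and it assumes 365 equally likely birthdays.

```python
def p_shared(n, days=365):
    """Probability that at least two of n people share a birthday."""
    prob_distinct = 1.0
    for k in range(n):
        # The (k+1)-th person must avoid the k birthdays already taken.
        prob_distinct *= (days - k) / days
    return 1 - prob_distinct


print(p_shared(22), p_shared(23))
```

With these assumptions the probability first exceeds one half at $n = 23$, the classic birthday paradox figure.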

Now replace 365 with the size of our hash space, $U = 2^{11}$:

$p_U(n) = 1 - \frac{U!}{(U-n)!U^n}$


Probability that no hash is repeated after $n$ samples, i.e. $1 - p_U(n)$:

$\frac{U}{U} \times (1 - \frac{1}{U}) \times (1 - \frac{2}{U}) \times \dots \times (1 - \frac{n-1}{U})$

Approximating each factor with $1 - \frac{k}{U} \approx e^{-\frac{k}{U}}$ (valid when $k \ll U$) gives

$1 - p_U(n) \approx 1 \times e^{-\frac{1}{U}} \times e^{-\frac{2}{U}} \times \dots \times e^{-\frac{n-1}{U}}
= e^{-\frac{1+2+\dots+(n-1)}{U}}
= e^{-\frac{n(n-1)}{2U}}
$

Using $n(n-1) \approx n^2$, the collision probability is
$p_U(n) \approx 1 - e^{-\frac{n^2}{2U}}$

For the target probability $p = 0.75$ we invert this formula. Since $n$ must be a whole number, the result is rounded up.

$n(p) = \sqrt{2U \ln(\frac{1}{1-p})}$



In [4]:
import math 

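U = 2**11  # number of possible 11-bit hash values
p = 0.75   # target collision probability

# n(p) = sqrt(2U * ln(1/(1-p)))
print(math.sqrt(2*U * math.log(1/(1-p))))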

75.35424144099038
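The value above comes from an approximation, so it is worth checking it against the exact product. The sketch below searches for the smallest $n$ whose exact collision probability reaches the target. `p_collision`, `TARGET` and `n_approx` are names introduced here for illustration, and uniform hash values are assumed.

```python
import math

U = 2**11      # number of possible 11-bit hash values
TARGET = 0.75  # target collision probability


def p_collision(n, space=U):
    """Exact probability that at least two of n uniform hashes collide."""
    prob_distinct = 1.0
    for k in range(n):
        prob_distinct *= 1 - k / space
    return 1 - prob_distinct


# Linear search for the smallest n that reaches the target probability.
n = 1
while p_collision(n) < TARGET:
    n += 1

n_approx = math.sqrt(2 * U * math.log(1 / (1 - TARGET)))
print(n, math.ceil(n_approx))
```

Under these assumptions the exact search and the rounded-up approximation give the same integer.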


# COLLISIONS

### Collider

https://marc-stevens.nl/research/md5-1block-collision/md5-1block-collision.pdf

We use the pair of colliding single-block messages given in the paper above: two different 64-byte inputs with the same MD5 digest.

In [13]:
import telnetlib  # note: removed from the standard library in Python 3.13
import json


HOST = 'socket.cryptohack.org' 
PORT = 13389 
tn = telnetlib.Telnet(HOST, PORT)


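def readline():
    # Read one newline-terminated line from the server.
    return tn.read_until(b"\n")

def json_recv():
    # Receive one line and decode it as JSON.
    line = readline()
    return json.loads(line.decode())

def json_send(hsh):
    # Serialize a dict as JSON and send it to the server.
    request = json.dumps(hsh).encode()
    tn.write(request)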

m1 = "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87\
d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18\
af bf a2 00 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75\
93 d8 49 67 6d a0 d1 55 5d 83 60 fb 5f 07 fe a2"
m1 = m1.replace(" ", "")

m2 = "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87\
d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18\
af bf a2 02 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75\
93 d8 49 67 6d a0 d1 d5 5d 83 60 fb 5f 07 fe a2"
m2 = m2.replace(" ", "")


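readline()  # discard the server's greeting line
json_send({"document": m1})  # first document is stored
print(json_recv())

json_send({"document": m2})  # different document with the same MD5 hash
print(json_recv())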


{'success': 'Document 008ee33a9d58b51cfeb425b0959121c9 added to system'}
{'error': 'Document system crash, leaking flag: crypto{m0re_th4n_ju5t_p1g30nh0le_pr1nc1ple}'}
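The collision can also be confirmed offline before contacting the server. This is a minimal sketch using only `hashlib`; it assumes the server computes MD5 over the hex-decoded document, which matches the digest shown in the output above. `diff` is a helper name introduced here.

```python
import hashlib

# The two 64-byte messages from the cell above, with spaces removed.
m1 = bytes.fromhex(
    "4dc968ff0ee35c209572d4777b721587"
    "d36fa7b21bdc56b74a3dc0783e7b9518"
    "afbfa200a8284bf36e8e4b55b35f4275"
    "93d849676da0d1555d8360fb5f07fea2"
)
m2 = bytes.fromhex(
    "4dc968ff0ee35c209572d4777b721587"
    "d36fa7b21bdc56b74a3dc0783e7b9518"
    "afbfa202a8284bf36e8e4b55b35f4275"
    "93d849676da0d1d55d8360fb5f07fea2"
)

# Byte offsets at which the two messages differ.
diff = [i for i in range(len(m1)) if m1[i] != m2[i]]

print(diff)
print(hashlib.md5(m1).hexdigest())
print(hashlib.md5(m2).hexdigest())
```

The messages differ in only two bytes yet produce the same digest, which is why the second submission is treated as a duplicate of the first.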
