# PROBABILITY

### Jack's Birthday Hash 

The background used below comes from the Brilliant wiki page on the birthday paradox:

https://brilliant.org/wiki/birthday-paradox/

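$p(n)$: the probability that at least one of $n$ randomly chosen inputs hashes to one fixed target value, where the hash output is an 11-bit string in $\{0,1\}^{11}$.

$2^{11} = 2048$: the number of possible hash values.

A single input misses the target with probability $1 - \frac{1}{2048}$, so all $n$ inputs miss it with probability

$1 - p(n) = {(1 - \frac{1}{2048})}^n$

$p(n) = 1 - {(1 - \frac{1}{2048})}^n$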

We use the first-order Taylor approximation $\ln(1-x) \approx -x$, which is valid for small $x$.

We are looking for the $n$ at which $p(n) \approx 0.5$:

$ 0.5 \approx 1 - {(1 - \frac{1}{2048})}^n $

$ - 0.5 \approx  - {(1 - \frac{1}{2048})}^n $

$  0.5 \approx   {(1 - \frac{1}{2048})}^n $

Taking logarithms of both sides:

$n \ln(1 - \frac{1}{2048}) \approx \ln(0.5)$

Applying the Taylor approximation with $x = \frac{1}{2048}$:

$-\frac{n}{2048} \approx \ln(0.5)$

$ n \approx -2048 \ln(0.5) = 2048 \ln(2) \approx 1420 $


In [6]:
import math 

U = 2**11 

round(U*math.log(2))

1420
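
As a sanity check on the Taylor step, the equation $0.5 = (1 - \frac{1}{2048})^n$ can also be solved without the approximation. The sketch below is only an illustration; the helper name `p_hit` is mine and not part of the challenge.

```python
import math

U = 2**11  # number of possible 11-bit hash values

def p_hit(n: int) -> float:
    """Probability that at least one of n random hashes equals a fixed target."""
    return 1 - (1 - 1 / U) ** n

# Solve p_hit(n) = 0.5 directly, without the Taylor approximation
n_exact = math.log(0.5) / math.log(1 - 1 / U)
# Value obtained with ln(1 - x) ~ -x, as in the derivation above
n_taylor = U * math.log(2)

print(n_exact, n_taylor)
# Smallest whole n that reaches 50%, and a check of both neighbours
print(math.ceil(n_exact), p_hit(1419) < 0.5 <= p_hit(1420))
```

The two estimates differ by less than one, so the approximation does not change the whole-number answer here.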

### Jack's Birthday Confusion

Let $p(n)$ be the probability that in a set of $n$ randomly chosen people at least two share the same birthday. Then $1-p(n)$ is the probability that all $n$ of them have distinct birthdays.

The number of ways to pick $n$ distinct birthdays from a set of 365 days is $365 \times 364 \times \dots \times (366 - n)$, because each successive birthday has one fewer day to choose from. This is the numerator.

The number of possible birthday assignments for $n$ people is $365^n$. This is the denominator.

$1 - p(n) = \frac{365 \times 364 \times \dots \times (366 - n)}{365^n} = \frac{365!}{(365-n)!365^n} \implies p(n) = 1 - \frac{365!}{(365-n)!365^n}$

Now replace 365 with the size of our hash space, $U = 2^{11}$:

$p_U(n) = 1 - \frac{U!}{(U-n)!U^n}$


Probability that no hash has repeated after $n$ samples:

$1 - p_U(n) = \frac{U}{U} \times (1 - \frac{1}{U}) \times (1 - \frac{2}{U}) \times \dots \times (1 - \frac{n-1}{U})$

We approximate each factor with $1-\frac{k}{U} \approx e^{-\frac{k}{U}}$:

$1 - p_U(n) \approx 1 \times e^{-\frac{1}{U}} \times e^{-\frac{2}{U}} \times \dots \times e^{-\frac{n-1}{U}} = e^{-\frac{1+2+\dots+(n-1)}{U}} = e^{-\frac{n(n-1)}{2U}}$

Taking $n(n-1) \approx n^2$, the collision probability is

$p_U(n) \approx 1 - e^{-\frac{n^2}{2U}}$

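For the target probability $p = 0.75$ we invert this formula to get $n$ as a function of $p$:

$n(p) = \sqrt{2U \ln(\frac{1}{1-p})}$

Since $n$ must be a whole number, the result is rounded up.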



In [4]:
import math 

U = 2**11 
p = 0.75 

print(math.sqrt(2*U * math.log(1/(1-p))))

75.35424144099038
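
The approximation can be cross-checked against the exact product formula for $1 - p_U(n)$. The following is a minimal sketch under that assumption; `p_collision` and `TARGET` are names I introduce for illustration.

```python
import math

U = 2**11      # size of the hash space
TARGET = 0.75  # required collision probability

def p_collision(n: int) -> float:
    """Exact probability that n uniform samples from U values contain a repeat."""
    no_repeat = 1.0
    for k in range(n):
        no_repeat *= 1 - k / U
    return 1 - no_repeat

# Smallest n whose exact collision probability reaches the target
n = 1
while p_collision(n) < TARGET:
    n += 1

# Closed-form estimate from the derivation above
n_approx = math.sqrt(2 * U * math.log(1 / (1 - TARGET)))
print(n, n_approx)
```

Rounding the closed-form estimate up gives the same whole number as the exact search.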


# COLLISIONS

### Collider

https://marc-stevens.nl/research/md5-1block-collision/md5-1block-collision.pdf

We use the two colliding single-block messages given in the paper above. They differ in only two bytes but have the same MD5 digest.

In [1]:
# Note: telnetlib is deprecated since Python 3.11 and was removed in Python 3.13
import telnetlib
import json


HOST = 'socket.cryptohack.org' 
PORT = 13389 
tn = telnetlib.Telnet(HOST, PORT)


def readline():
    return tn.read_until(b"\n")

def json_recv():
    line = readline()
    return json.loads(line.decode())

def json_send(hsh):
    request = json.dumps(hsh).encode()
    tn.write(request)

m1 = "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87\
d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18\
af bf a2 00 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75\
93 d8 49 67 6d a0 d1 55 5d 83 60 fb 5f 07 fe a2"
m1 = m1.replace(" ", "")

m2 = "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87\
d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18\
af bf a2 02 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75\
93 d8 49 67 6d a0 d1 d5 5d 83 60 fb 5f 07 fe a2"
m2 = m2.replace(" ", "")


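readline()  # discard the server's greeting line
json_send({"document": m1})
print(json_recv())

# The second document is different but has the same MD5 digest as the first
json_send({"document": m2})
print(json_recv())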



{'success': 'Document 008ee33a9d58b51cfeb425b0959121c9 added to system'}
{'error': 'Document system crash, leaking flag: crypto{m0re_th4n_ju5t_p1g30nh0le_pr1nc1ple}'}
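
The collision can also be confirmed locally with `hashlib` before contacting the server. This is a small sketch that assumes the server hashes the decoded bytes with MD5, which is consistent with the digest shown in its first reply; the hex strings are copied from the cell above.

```python
import hashlib

# The two 64-byte messages from the paper (bytes.fromhex ignores the spaces)
m1 = bytes.fromhex(
    "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87"
    "d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18"
    "af bf a2 00 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75"
    "93 d8 49 67 6d a0 d1 55 5d 83 60 fb 5f 07 fe a2"
)
m2 = bytes.fromhex(
    "4d c9 68 ff 0e e3 5c 20 95 72 d4 77 7b 72 15 87"
    "d3 6f a7 b2 1b dc 56 b7 4a 3d c0 78 3e 7b 95 18"
    "af bf a2 02 a8 28 4b f3 6e 8e 4b 55 b3 5f 42 75"
    "93 d8 49 67 6d a0 d1 d5 5d 83 60 fb 5f 07 fe a2"
)

digest1 = hashlib.md5(m1).hexdigest()
digest2 = hashlib.md5(m2).hexdigest()
differing_bytes = sum(a != b for a, b in zip(m1, m2))

print(digest1)
print(digest2)
print(m1 != m2, digest1 == digest2, differing_bytes)
```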


### Hash Stuffing

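Every step of `scramble_block` and of the per-block mixing in `cryptohash` is either an XOR with a constant or a byte rotation, so each step can be undone. In addition, `pad` appends nothing when the message length is already a multiple of `BLOCK_SIZE`. We can therefore take any one-block message `m1`, pick a different first block for `m2`, and solve for the second block of `m2` that steers its hash to the hash of `m1`.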
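
The inversion can be sanity-checked offline before talking to the server. The sketch below copies the constants `W` and `X` and the helper functions from the challenge code, and tests the round trip on an arbitrary 32-byte block; the `sample` value is my own choice.

```python
BLOCK_SIZE = 32

# Constants copied from the challenge source
W = [0x6b17d1f2, 0xe12c4247, 0xf8bce6e5, 0x63a440f2, 0x77037d81, 0x2deb33a0, 0xf4a13945, 0xd898c296]
X = [0x4fe342e2, 0xfe1a7f9b, 0x8ee7eb4a, 0x7c0f9e16, 0x2bce3357, 0x6b315ece, 0xcbb64068, 0x37bf51f5]
W_bytes = b"".join(w.to_bytes(4, "big") for w in W)
X_bytes = b"".join(x.to_bytes(4, "big") for x in X)

def xor(a, b):
    return bytes(p ^ q for p, q in zip(a, b))

def rotate_left(data, x):
    x %= BLOCK_SIZE
    return data[x:] + data[:x]

def rotate_right(data, x):
    x %= BLOCK_SIZE
    return data[-x:] + data[:-x]

def scramble_block(block):
    for _ in range(40):
        block = xor(W_bytes, block)
        block = rotate_left(block, 6)
        block = xor(X_bytes, block)
        block = rotate_right(block, 17)
    return block

def unscramble(block):
    # Undo the four steps of each round in reverse order
    for _ in range(40):
        block = rotate_left(block, 17)
        block = xor(X_bytes, block)
        block = rotate_right(block, 6)
        block = xor(W_bytes, block)
    return block

sample = bytes(range(BLOCK_SIZE))  # arbitrary 32-byte test block
round_trip_ok = unscramble(scramble_block(sample)) == sample
print(round_trip_ok)
```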

In [16]:
from pwn import remote  # only `remote` is needed from pwntools
import json
# 2^128 collision protection!
BLOCK_SIZE = 32

# Nothing up my sleeve numbers (ref: Dual_EC_DRBG P-256 coordinates)
W = [0x6b17d1f2, 0xe12c4247, 0xf8bce6e5, 0x63a440f2, 0x77037d81, 0x2deb33a0, 0xf4a13945, 0xd898c296]
X = [0x4fe342e2, 0xfe1a7f9b, 0x8ee7eb4a, 0x7c0f9e16, 0x2bce3357, 0x6b315ece, 0xcbb64068, 0x37bf51f5]
Y = [0xc97445f4, 0x5cdef9f0, 0xd3e05e1e, 0x585fc297, 0x235b82b5, 0xbe8ff3ef, 0xca67c598, 0x52018192]
Z = [0xb28ef557, 0xba31dfcb, 0xdd21ac46, 0xe2a91e3c, 0x304f44cb, 0x87058ada, 0x2cb81515, 0x1e610046]

# Lets work with bytes instead!
W_bytes = b''.join([x.to_bytes(4,'big') for x in W])
X_bytes = b''.join([x.to_bytes(4,'big') for x in X])
Y_bytes = b''.join([x.to_bytes(4,'big') for x in Y])
Z_bytes = b''.join([x.to_bytes(4,'big') for x in Z])

def pad(data):
    padding_len = (BLOCK_SIZE - len(data)) % BLOCK_SIZE
    return data + bytes([padding_len]*padding_len)

def blocks(data):
    return [data[i:(i+BLOCK_SIZE)] for i in range(0,len(data),BLOCK_SIZE)]

def xor(a,b):
    return bytes([x^y for x,y in zip(a,b)])

def rotate_left(data, x):
    x = x % BLOCK_SIZE
    return data[x:] + data[:x]

def rotate_right(data, x):
    x = x % BLOCK_SIZE
    return  data[-x:] + data[:-x]

def scramble_block(block):
    for _ in range(40):
        block = xor(W_bytes, block)
        block = rotate_left(block, 6)
        block = xor(X_bytes, block)
        block = rotate_right(block, 17)
    return block

def cryptohash(msg):
    initial_state = xor(Y_bytes, Z_bytes)
    msg_padded = pad(msg)
    msg_blocks = blocks(msg_padded)
    for i,b in enumerate(msg_blocks):
        mix_in = scramble_block(b)
        for _ in range(i):
            mix_in = rotate_right(mix_in, i+11)
            mix_in = xor(mix_in, X_bytes)
            mix_in = rotate_left(mix_in, i+6)
        initial_state = xor(initial_state,mix_in)
    return initial_state.hex()


# Inverse of scramble_block: undo the four steps of each round in reverse order
def unscramble(block):
    for _ in range(40):
        block = rotate_left(block, 17)
        block = xor(X_bytes, block)
        block = rotate_right(block, 6)
        block = xor(W_bytes, block)
    
    return block 


  
c = remote('socket.cryptohack.org', 13405) 
print(c.recvline().decode())
print(c.recvline().decode())

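# m1 is a single full block; m2 starts with a different full block
msg1 = X_bytes
msg2 = Y_bytes

target = cryptohash(msg1)  # digest that m2 has to reach
state = cryptohash(msg2)   # digest of m2 after its first block only

# Value the second block must contribute, followed by the inverse of the i = 1 mixing round
mix_in = xor(bytes.fromhex(target), bytes.fromhex(state))
mix_in = rotate_right(mix_in, 7)
mix_in = xor(mix_in, X_bytes)
mix_in = rotate_left(mix_in, 12)

# Undo scramble_block to recover the raw second block of m2
msg2_block = unscramble(mix_in)
msg2 = msg2 + msg2_block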

request = json.dumps({'m1': msg1.hex(), 'm2': msg2.hex()})
c.sendline(request.encode())

response = c.recvline().decode()
print(response)


[+] Opening connection to socket.cryptohack.org on port 13405: Done
Can you help beta test our new CryptoHash? If you find a collision, we'll give you a flag!



Please send two hex encoded messages m1, m2 formatted in JSON: 

{"flag": "Oh no! Looks like we have some more work to do... As promised, here's your flag: crypto{Always_add_padding_even_if_its_a_whole_block!!!}"}
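
The flag text points at a simpler collision than the one constructed above: because `pad` adds nothing to a full block, a message and its own padded form become identical after padding. The sketch below only exercises the `pad` function copied from the challenge; the 31-byte message is a hypothetical example, and I have not submitted this pair to the server.

```python
BLOCK_SIZE = 32

def pad(data):
    # Same rule as the challenge: nothing is appended when len(data)
    # is already a multiple of BLOCK_SIZE
    padding_len = (BLOCK_SIZE - len(data)) % BLOCK_SIZE
    return data + bytes([padding_len] * padding_len)

m1 = b"A" * 31      # hypothetical 31-byte message, padded with one 0x01 byte
m2 = m1 + b"\x01"   # 32 bytes, already a full block, so left unchanged

print(m1 != m2, pad(m1) == pad(m2))
```

Since `cryptohash` only ever sees the padded message, two inputs with equal padded forms necessarily produce equal digests.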



### PriMeD5

# HASH-BASED CRYPTOGRAPHY

### Merkle Trees

In [17]:
import ast 
from hashlib import sha256
from Crypto.Util.number import long_to_bytes


def hash256(data):
    return sha256(data).digest()

def merge_nodes(a, b):
    return hash256(a+b)


data = [] 
with open("data/merkle_tree.txt") as f: 
    lines = f.readlines() 
    for l in lines: 
        data.append(ast.literal_eval(l))


bitstring = ""
for datum in data:  
    a = bytes.fromhex(datum[0]) 
    b = bytes.fromhex(datum[1])
    c = bytes.fromhex(datum[2])
    d = bytes.fromhex(datum[3])
    root = bytes.fromhex(datum[4])
    bitstring+=str(int(root == merge_nodes(merge_nodes(a,b), merge_nodes(c,d))))

print(long_to_bytes(int(bitstring,2)).decode())

crypto{U_are_R3ady_For_S4plins_ch4lls}
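
The check performed in the cell above can be illustrated on made-up data. In this sketch, each record is assumed to hold four nodes `a, b, c, d` and a claimed root, and a record yields bit 1 only when the recomputed root matches; the leaves and the helper `root_of` are hypothetical and not taken from the challenge file.

```python
from hashlib import sha256

def merge_nodes(a, b):
    # Parent node is the SHA-256 of the two children concatenated
    return sha256(a + b).digest()

def root_of(a, b, c, d):
    # Root of a four-leaf tree, built the same way as in the cell above
    return merge_nodes(merge_nodes(a, b), merge_nodes(c, d))

# Hypothetical leaves for illustration only
leaves = [sha256(bytes([i])).digest() for i in range(4)]

good_root = root_of(*leaves)
# A root computed with the first two leaves swapped does not match
bad_root = root_of(leaves[1], leaves[0], leaves[2], leaves[3])

# One bit per claimed root: 1 if it matches the recomputed root, else 0
bits = "".join(str(int(root_of(*leaves) == r)) for r in (good_root, bad_root))
print(bits)
```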
