# Module 2 (Block ciphers)

### What are block ciperhs?

While stream ciphers operate on stream of bits:
> (formally stream cipher works on bits (0 or 1) but this stream_cipher works on bytes but the idea is shown in general)

In [45]:
def message_generator(message):
    for m in message:
        yield m

def prg_generator(seed):
    taps = [16, 14, 13, 11]
    lfsr = seed
    while True:
        bit = 0
        for t in taps:
            bit = bit ^ ((lfsr >> (16 - t)) & 0b1)
        lfsr = (lfsr >> 1) | (bit << 15)
        yield lfsr
        if lfsr == seed or lfsr == 0:
            break

def stream_cipher(message_gen, key_gen) -> list[bool]:
    return ''.join([chr(ord(m) ^ k) for (m, k) in zip(message_gen, key_gen)])

ptext = 'hello from omsk'
ctext = stream_cipher(message_generator('hello from omsk'), prg_generator(123))
print(ctext)
dtext = stream_cipher(message_generator(ctext), prg_generator(123))
print(dtext)

聕䁻⁣遫䡬ꐡ퉦楲㓯ᨭഀ蛿䌥ꇗ傹
hello from omsk


But the Block Ciphers works on Blocks. There are two main blocks in block cipher:
- input/output block of size n
- key block of size k

In [46]:
# for AES128
N = 128
K = 128
def block_cipher(ptext: list[bool], key: list[bool]) -> list[bool]:
    assert(len(ptext) == N)
    assert(len(key) == K)

    ks = key_expansion(key)
    # for AES128 # of key (rounds) is 10
    assert(len(ks) == 10)

    m = ptext
    for k in ks:
        m = round_function(k, m)
    
    return m

# PRF? PRG?
def round_function(k: list[bool], m: list[bool]) -> list[bool]:
    pass

def key_expansion(key: list[bool]) -> list[list[bool]]:
    pass

##### PRF and PRP
1. PRF is a Pseudo Random Function:
   - E : K -> X -> Y
2. PRP is a Pseudo Random Permutation:
    - E is PRP
    - E : K -> X -> X
    - E is efficient and deterministic algorithm
    - E is one to one. E is reversible. E is bijective
    - reverse(E) is efficient algorithm

> PRP is a PRF but PRP output is X (input space) which means that PRP is invertible PRF


Using PRG you can easily construct PRG:
- assume that we got PRF F that is secure PRF
- that F : X -> K -> Y. Y is {0, 1} ^ n
- then our PRG : K -> {0, 1} ^ nt = G(k, 0) || G(k, 1) || ... || G(k, t)
- and that is secure PRG
- secure because if F is secure, then F can be replaces by truly random funciton f (which provides truly random sequence of n bits)
- and then PRG is f(0) || f(1) || ... which is truly random


### The DES (data encryption standard)

The Feistel is the main guy behind DES. He also invent a Feistel Network algorithm:

In [47]:
def feistel_round(l, r, f):
    return (r, f(r) ^ l)

def feistel_network(input, f, N = 3):
    l, r = dword_decay(input)
    for i in range(N):
        l, r = feistel_round(l, r, f(i))
    return dword_bound(l, r)

def dword_decay(n):
    return ((n >> 32) & 0xFFFFFFFF, n & 0xFFFFFFFF)

def dword_bound(l, r):
    return ((l & 0xFFFFFFFF) << 32 | r & 0xFFFFFFFF) 

def dword_swapped(n):
    l, r = dword_decay(n)
    return (r << 32 | l)

n = 4
f = lambda i: lambda x: x * 33911 % 482901

n0 = feistel_network(n, f)
n1 = dword_swapped(feistel_network(dword_swapped(n0), f))

print(n0, n1)


823186317099862 4


DES algorithm is 16 round feistel network (N = 16) with 64 bits input/output 

Each round function f_i (where 0 <= i < N) is simply:
```raw
f_i = F(k_i, x)
```

In other words same F just used with different keys

DES key length is 56 bits and from this 56 bit key somehow derived 16 48 bit keys (k_i) (48 = 32 + 16)

In [48]:
# expand x from 32 bits to 48
# via bits shuffling and copying and etc
def expand_x(x):
    return ((x << 32) | x) & 0xFFFFFFFFFFFF

# S box takes x as 6 bit value and map it to 4 bit value via LUT
def S(i, x):
    return x & 0b1111

def F(k_i, x):
    ex = expand_x(x)
    # 48 bit value
    ex_xor_k = ex ^ k_i
    out = 0
    for i in range(8):
        S_out = S(i, (ex_xor_k >> (i * 6) & 0b111111))
        out = out | (S_out << (i * 4))

    return out

# return 16 48 bit keys
def key_expansion(k, inverse=False):
    ks = []
    rangee = reversed(range(8)) if inverse else range(8)
    for i in rangee:
        k_i = (k << i & 0xFFFFFFFFFFFF)
        k_i_bytes = bytearray()
        for j in range(6):
            k_i_bytes.append(k_i << (j * 8) & 0xFF)
        if inverse:
            ks.append(int.from_bytes(k_i_bytes, byteorder='big'))
            ks.append(int.from_bytes(k_i_bytes, byteorder='little'))
        else:
            ks.append(int.from_bytes(k_i_bytes, byteorder='little'))
            ks.append(int.from_bytes(k_i_bytes, byteorder='big'))
    return ks

# sizeof(k) = 56 bits; sizeof(p) = 64 bits
def DES(k, p, encrypt=True):
    N = 16
    ks = key_expansion(k, inverse=not encrypt)
    if not encrypt:
        p = dword_swapped(p)
    c = feistel_network(p, lambda i: lambda x: F(ks[i], x), N)
    if not encrypt:
        c = dword_swapped(c)
    return c

k = 0xF3A87C31
p = 123
c = DES(k, p)
dc = DES(k, c, False)
print(p, c, dc)

123 50384561271821664 123


If S-box is linear function, for example it xors its bits in some order:
```raw
o0 = i0 ^ i3
o1 = i1 ^ i5
o2 = i2 ^ i4
o3 = i2 ^ i3 ^ i5
```

then whole S-box can be represented as vector matrix multiplication mod 2, where:
```raw
| 1 0 0 1 0 0 |   |i0|   |o0|
| 0 1 0 0 0 1 | * |i1| = |o1|
| 0 0 1 0 1 0 |   |i2|   |o2|
| 0 0 1 1 0 1 |   |i3|   |o3|          
                  |i4|     
                  |i5|     
```

And furthermore the whole DES can be represented as matrix multiplication where there is:
- matrix B with 832 x 64
- vector X with 1 x 832 [m..., k0..., k1..., ..., k15...] expanded bits of message (64) and 16 * 48 bits of keys
- B * X is the C which is the output of DES

In that case there is huge vulnerability for DES:
```raw
DES(k, x) = B * [x..., k...]
DES(k, x0) ^ DES(k, x1) ^ DES(k, x2) = DES(k, x0 ^ x1 ^ x2)
```

which is a huge deal because there is no semantic secrecy for such DES since given DES(k, x0) and DES(k, x1) you can distringuish them:
```raw
Got x0 and x1
Gives x0, x1 and x0 ^ x1
Get one of DES(k, x0), DES(k, x1) or DES(k, x0 ^ x1)
Received all of DES outputs can map DES output to source x since DES(k, x0 ^ x1) = DEK(k, x0) ^ DES(k, x1)
```

And to avoid this potential vulnerability the **S boxes must NOT BE LINEAR**

### Exhaustive search attacks
Attack:
- given c and m where c = DES(k, m)
- find k knowing c and m

Q: how many such k are there. We know that m is encrypted using the specific key, but is there only one key k such that c = DES(k, m)?\
A: 
- suppose DES is an ideal cipher
- then DES is a collection of random invertible functions f: {0, 1}^64 -> {0, 1}^64 and the size of this collection is 2^56 (because there is 2^56 keys)
- we got k such that c = DES(k, m)
- there are 2^56 - 1 keys left, lets call them K'
- what is the probability that there is k' from K' such that c = DES(k, m) = DES(k', m)?

More formally:
```raw
Pr [DES(k, m) = DES(k', m)]

m -> DES(*, m) -> 2^56 different c's

DES(k) defines a permutation P. sizeof(P) = 2^64. P is uniformly distributed. All elements in P are unique (since DES(k) is invertible which means that DES(k, m) is bijective function one to one mapping). And encryption using DES is just sampling this permutation DES(k) at index m or DES(k)[m]

DES(*) defines 2^56 such permutations so we can even simplify:
DES = list[list[int]]
DES_encrypt(k, m) => DES[k][m]

In that case Pr [DES(k, m) = DES(k', m)] = 1/2^64 (sample DES(k')[m] = c with probability = 1/2^64)
But there are 2^56 such k'
So probability that there are such k' in K' is (2^56 - 1) * (1 / 2^64) = 1/256
```

**EACH MATHEMATICAL DISCRETE FUNCTION IS A PERMUTATION P** where sizeof(P) = sizeof(input space) and sizeof(uniques(P)) = sizeof(output space)

Each f :: input -> output defines a permutation P_f :: output[sizeof(input)] and\
f becomes f x = P_f[x]

In [49]:
def DES_P(k) -> list[int]:
    xs = [] # range(2^64)
    shuffle(xs, k) # shuffle xs based on k
    return xs

def DES_encrypt(k, m):
    return DES_P(k)[m]

- DES is invertible since its a feistel_network
- is lambda x: F(k_i, x) must be invertible? no because feistel_network is invertible with any F. And in fact F uses S boxes and S boxes are not invertible because it maps 2^6 into 2^4!!!
- but feistel_round itself is invertible with any F since it uses xor and that's the property of feistel_network -- its invertible with any F

Q: Is feistel_round invertible?\
A: YES!

In [50]:
n = 10061999
f = lambda x: dword_bound(*feistel_round(*dword_decay(x), lambda x: x + 123))
f_inv = lambda x: dword_swapped(dword_bound(*feistel_round(*dword_decay(dword_swapped(x)), lambda x: x + 123)))

print(f(n), f_inv(f(n)))

43215956647446826 10061999


To summarize:
- given c = DES(k, m) and m the probability that there is only one such k is 1 - 1/256
- given c0, c1 and m0, m1 probability that there is only one such k is 1 - 1/2^71
- that facts means that if you got only one pair (c, m) the key is completely determined (its safe to say that there is only one such k)

The attack where the (c, m) are known and you derive k from it is called **exhaustive key search**

The DES is completely breaked:
- there are attacks that can break the DES encryption in 22 hours
- there are specific computers that can do that in less then a week

The simplest and "dumbest" solution is 3DES:

In [51]:
# sizeof(k) = 168 bits
def DES3(k, m):
    bits_48_mask = 0x111111111111111111111111111111111111111111111111
    k0 = k >> 48 * 0 & bits_48_mask
    k1 = k >> 48 * 1 & bits_48_mask
    k2 = k >> 48 * 2 & bits_48_mask
    return DES(k2, DES(k1, DES(k0, m, True), False), True) # E(D(E))

Q: I wonder is the fact that DES output the same c for the same m is a vulnerability because the?

In [None]:
text = 'attack 0attack 0'
ctext = ''

def pack2long(s, N=64):
    assert(N % 8 == 0)
    bytes = [ord(c) for c in s[:N//8]]
    l = 0
    for i, b in enumerate(bytes):
        l = l | (b << i * 8)
    return l

def unpack2str(l, N = 8):
    s = []
    for i in range(N):
        s.append(chr((l >> i * 8) & 0xFF))
    return ''.join(s)

k = 0x589dea13bc
text_bytes = [pack2long(text[0:8]), pack2long(text[8:])]
ctext_bytes = [DES(k, m) for m in text_bytes]
# THE CTEXT_BYTES ARE THE SAME. ISN'T THAT BAD FOR CIPHER?
print(ctext_bytes)
ectext_bytes = [DES(k, c, False) for c in ctext_bytes]
ectext = ''.join([unpack2str(ec) for ec in ectext_bytes])

print(ectext)

[7319812402332998806, 7319812402332998806]
attack 0attack 0


##### Double DES Attack
There is a 3DES, but 2DES doesn't improve DES security at all.
```raw
2DES (k0, k1, m) = E(k1, E(k0, m))
```

We need to find the (k0, k1), so:
```raw
E(k1, E(k0, m)) = c <=>    # O(n^2) problem because of the nested E?
E(k0, m) = D(k1, c)        # O(2 * n) problem because of non nested E?
```

We got some number of ms and cs:
```raw
M = (m0, m1, ..., m9)
C = (c0, c1, ..., c9)
```

And whole sceheme is like this:
```raw
m -> E(k0) -> E(k1) -> c
```

To break the 2DES compute the table t0:
```python
t0 = [E(k, M) for k in K]
# t0 is list[list[int]] M is list[int]
```
then compute t1:
```python
t1 = [D(k, C) for k in K]
```

And since E(k0, m) = D(k1, c) find the intersecitons between t0 and t1


### AES

In [53]:
# sizeof(k) = 16 bytes
# sizeof(m) = 16 bytes
def AES_128(k, m):
    ks = AES_key_expansion(k)
    for i in range(len(m)):
        m[i] ^= ks[0][i]
    for i, k in enumerate(ks[1:]):
        m = AES_round(i, k, m)
    return m

def AES_round(i, k_i, m):
    if False and i == 9: # last round
        m = AES_ShiftRows(AES_ByteSub(m))
    else:
        m = AES_MixColumns(AES_ShiftRows(AES_ByteSub(m))) 
    return [(mi ^ ki) for mi, ki in zip(m, k_i)]

# S box byte -> byte
# sizeof(xs) is 16 bytex: 4x4 byte matrix
def AES_ByteSub(xs):
    LUT = [i for i in range(1 << 8)] # id for testing purposes
    return [LUT[x] for x in xs]

def AES_ShiftRows(xs):
    out = [0 for _ in xs]
    for y in range(4):
        for x in range(4):
            out[y * 4 + x] = xs[y * 4 + (x + y) % 4]
    return out

def AES_MixColumns(xs):
    M = [15, 221, 21, 103, 12, 170, 94, 223, 108, 42, 49, 70, 71, 225, 149, 79]
    out = byte_matmul(M, xs)
    return out

def byte_matmul(A, B):
    C = [0] * 16
    for B_x in range(4):
        for A_y in range(4):
            for i in range(4):
                C[A_y * 4 + B_x] += (A[A_y * 4 + i] * B[i * 4 + B_x])
                C[A_y * 4 + B_x] &= 0xFF
    return C

# sizeof(k) = 16 bytes
# sizeof(out) = 11 keys; each key is 16 bytes
def AES_key_expansion(ks):
    return [[(k + (1 << i)) & 0xFF for k in ks] for i in range(11)]

key = [ord(c) for c in unpack2str(0x31278597329485683127859732948568, N=16)]
AES_128(key, [ord(c) for c in 'l sdgj 1dl ld ds'])


[178, 94, 146, 167, 235, 126, 198, 125, 229, 51, 27, 168, 225, 226, 29, 36]

There are aes hardware instructions on AMD and Intel CPUs:
- amdenc, amdenclast takes 2 xmm registers (128 bits). they perform the shuffle, shift and mix at the hardware level

Also worth mentioning that there is a lot to precompute in AES which can drastically increase the speed of enc/dec

### Block ciphers from PRG

Can we build a PRF from a PRG?

```raw
G : K -> K^2
F : K x {0, 1} -> K

F(K, x)= G(K)[x]
```

In [54]:
def bbs_gen(p, q, seed):
    M = p * q
    x = seed
    while True:
        x = x ** 2 % M
        yield x % 2

def bbs_int(p, q, seed, N = 32):
    gen = bbs_gen(p, q, seed)
    while True:
        out = 0
        for i in range(N):
            out |= next(gen) << i
        yield out

# just apply prg's in chain based on control bit of x
def bbs_prf(p, q, key, x):
    gen = bbs_int(p, q, key)
    rs = [next(gen), next(gen)]
    while True:
        r = rs[x % 2]
        x //= 2
        if x <= 0:
            return r
        gen = bbs_int(p, q, r)
        rs = [next(gen), next(gen)]

bbs_prf(2147483647, 4294967291, 4282389123, 6)


410748765

The main idea is that when you got a PRG that outputs double of its key size, then you can build PRF which outputs {0,1}^n for any n

By constructing PRF from PRG we can plug this PRF into feistel network and got a PRP which is a block cipher:\
PRG -> PRP

```raw
let X = {0, 1}
Perms[X] = {id, not}
E(k, x) = k xor x
E(0, 0) = 0 xor 0 = 0
E(0, 1) = 0 xor 1 = 1
E(1, 0) = 1 xor 0 = 1
E(1, 1) = 1 xor 1 = 0

E(0, x) = id x
E(1, x) = not x

thus E(k, x) = k xor x is secure PRP if X = {0, 1} and K = {0, 1} and Perms[X] = {id, not}
```

### Modes of operation (one time key)

In previous section I mentioned that if you use the same k for AES or DES, then if m0 == m1 then c0 == c1 which break semantic security of a cipher.

If you're bound to that one key, then you should use your PRP (AES or DES) in Determined counter mode, which converts PRP into a stream cipher:

In [55]:
from math import *

def DES_DCM(k, m, encrypt = True):
    qwords = []
    if encrypt:
        for i in range(ceil(len(m) / 8)):
            qwords.append(pack2long(m[i * 8:]))
    else:
        qwords = m
    
    c = [0] * len(qwords)
    for i, q in enumerate(qwords):
        c[i] = DES(k, i) ^ q
    if encrypt:
        return c
    else:
        return ''.join([unpack2str(ci) for ci in c])

k = 0x589dea13bc
m = 'hello priver holla'
c = DES_DCM(k, m)
print(c)
ce = DES_DCM(k, c, False)
print(ce)

[8246139601914979692, 7795559470664611436, 36042172620890474]
hello priver holla      



When using PRP under DCM we just use PRF to generate key for OTP

### Modes of operation (many time key)

There is a CPA attack (chosen plaintext attack). Its used when E provide the same c for the same m. In that case CPA can break semantic security:
- adv sends m0 = m1 = m to challenger
- challenger sends back c0 = c1 = c since E output same c for same m (this is the case when we use DES with the same key)
- now adv sends m0, m2
- challenger sends c0, c2
- since adv knows the c0 = E(k, m0) he can clearly see which cipher text he reives

To fix that EVERY encryption block should have some new value. One of the approaches to face CPA is randomized encryption.

Randomized encryption:
- get random r for each encryption block
- output it with encrypted data

In [56]:
import time

def str2qwords(s):
    qwords = []
    for i in range(ceil(len(s) / 8)):
        qwords.append(pack2long(s[i * 8:]))
    return qwords

def DES_randomized_encryption(k, m):
    qwords = str2qwords(m) 
    c = [(0, 0)] * len(qwords)
    gen = bbs_int(2147483647, 4294967291, int(time.time()), N = 56)
    for i, q in enumerate(qwords):
        r = next(gen)
        c[i] = (r, DES(k, r) ^ q)
    return c

DES_randomized_encryption(k, m)


[(35339880014975217, 16900939606086206130),
 (62813615862002403, 13029440436982173608),
 (50978495973431847, 8392002293671879811)]

In that case CPA is impossible because even if adv sends m0 = m1 = m, the challenger will use unique r for each encryption and c0 != c1. And since r must be truly random, then such PRF is semantically secure even under CPA.

There is also **nonce-based encryption**. Its very simillar to determined counter mode, butt counter is now stored between multiple messages:

In [57]:
def DES_nonce(k, m):
    qwords = str2qwords(m)
    
    c = [0] * len(qwords)
    for i, q in enumerate(qwords):
        c[i] = DES(k, DES_nonce.nonce) ^ q
        DES_nonce.nonce += 1
    return c

DES_nonce.nonce = 0

c = DES_nonce(k, m)
print(c)

[8246139601914979692, 7795559470664611436, 36042172620890474]


In [58]:
DES_nonce(k, 'hello words is that all?')

[8049951530262750575, 8367828043602027634, 4552119204516749673]

### Cipher Block Chaining (CBC)

It's also a many time key mode of operation that's used for protection from CPA:

In [65]:
def CBC_E(E, k, m):
    init_vector = next(bbs_int(2147483647, 4294967291, int(time.time() * 100), N=64))
    qwords = str2qwords(m)
    c = [0] * len(qwords)
    x = init_vector
    for i, q, in enumerate(qwords):
        x = E(k, q ^ x)
        c[i] = x
    return (init_vector, c)

def CBC_D(D, k, ivc):
    iv, cs = ivc
    x = iv
    ms = [0] * len(cs)
    for i, c in enumerate(cs):
        ms[i] = x ^ D(k, c)
        x = c
    return ''.join([unpack2str(m, N=64) for m in ms])

iv, c = CBC_E(DES, k, 'privet holla salute hello bonjour')
print(c)
c[1] = (~c[1]) & 0xFFFFFFFFFFFFFFFF
ec = CBC_D(lambda k, x: DES(k, x, encrypt=False), k, (iv, c))
print(ec)
'privet h salßo bonjour'

[5072356290472880772, 6925787750633788932, 8207471092584713694, 12410530623676972098, 3127553852583742413]
privet h                                                         sal                                                        ß                                                        o bonjou                                                        r                                                               


'privet h\x90\x93\x93\x9e sal\x8a\x8b\x9aß\x97\x9a\x93\x93o bonjour'

CBC theorem states that:
```raw
CBC Advantage <= 2 * CBC_PRP Advantage + CBC_error_term

CVC_error_term = 2 * (# of messages encrypted with k)^2 * (length of max message)^2 / |input_space|
               = 2 * q^2 * L^2 / |X|
```

Since we use semantically secure PRP, then CVC_PRP Advantage is very very small.

For block ciphers the L is length of the block (for DES its 64 bits).

The input_space for DES is {0,1}^64 so length of it = 2^64

So if we want CBC Advantage <= 1/2^32 (which is very small number),\
then CBC_error_term < 1/2^32
```raw
2 * q^2 * 8^2 / 2^64 = 
q^2 * 2^7 / 2^64 = 
q^2 / 2^57 < 1 / 2^32
q^2 < 2^25
q < 2^12.5 = 2^12
```

Which means that after 2^12 encyrpted cipher blocks, the CBC_DES error term becomes large enough thus breaks semantic security

But if attacker can predict IV for the next encryption, then CBC is no longer semantically secure!

### PKCE Padding scheme

If message length isn't divisible by block length, then message needs padding:

In [60]:
def PKCE_pad(m, N = 64):
    assert(N % 8 == 0)
    bytes_in_block = N // 8
    qwords = []
    bytes_count = len(m)
    full_blocks = bytes_count // bytes_in_block
    pad = bytes_in_block
    full_blocks += 1
    if len(m) % bytes_in_block != 0:
        pad = full_blocks * bytes_in_block - bytes_count
    print(pad)

    for i in range(full_blocks):
        if i == full_blocks - 1:
            qwords.append(pack2long([c for c in m[i*8:]] + [chr(pad)] * pad, N=N))
        else:
            qwords.append(pack2long(m[i * 8:], N=N))
    return qwords

[[ord(c) for c in unpack2str(m)] for m in PKCE_pad('hello world')]

5


[[104, 101, 108, 108, 111, 32, 119, 111], [114, 108, 100, 5, 5, 5, 5, 5]]

se we pad the message with the bytes, which equals to the amount of padded bytes!

if message doesn't need padding since its divisible by block size, then we add full block of padding

### CTR

CTR is something between CBC and DCM:
- chose absolute random IV
- use this IV as the initial value of a counter for DCM

Also CBC is forces to use PRP, because if it uses PRF then you're unable to decrypt (only PRP is reversible),\
while CTR can use PRF because its just OTP with key computed using IV counter and PRF

Also CBC isn't parallelizable, while CTR being a OTP can be parallelized

Also CTR doesn't require message padding because its basically OTP, while CBC need a full block to encrypt it which requires padding for smaller messages