# Practical Activity 2: Symmetric Cryptography

## 🎯 Objectives
- Implement examples of **symmetric cryptography** in Python.
- Understand **block ciphers** vs **stream ciphers**.
- Use **DES** and **AES** (ECB/CBC/CTR/GCM) with `pycryptodome`.
- Solve and **explain** core exercises about modes, IVs, and keystream reuse.

---
## 1. Introduction
Symmetric cryptography uses the **same secret key** for encryption and decryption. There are two broad families:
- **Block ciphers** (operate on fixed-size blocks, e.g., DES with 64-bit blocks, AES with 128-bit blocks).
- **Stream ciphers** (generate a keystream and XOR it with plaintext, e.g., ChaCha20, Salsa20).

**DES** is historically important but **insecure today** due to its short key (56 bits). **AES** is the modern standard and remains secure when combined with a sound **mode of operation** and correctly chosen parameters (keys, IVs, nonces).


### Environment setup
This notebook expects **pycryptodome**. If needed, install it in your environment:
```bash
pip install pycryptodome
```


In [1]:
# Imports and helpers
import os
from binascii import hexlify

try:
    from Crypto.Cipher import DES, AES, ChaCha20
    from Crypto.Random import get_random_bytes
except ImportError as e:
    raise RuntimeError("pycryptodome is required: pip install pycryptodome") from e

def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    """Append 1..block_size bytes, each equal to the padding length (PKCS#7)."""
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len])*pad_len

def pkcs7_unpad(data: bytes) -> bytes:
    """Strip PKCS#7 padding; raise ValueError if the padding is malformed."""
    if not data:
        raise ValueError("Empty input")
    pad_len = data[-1]  # the last byte encodes how many padding bytes were added
    if pad_len < 1 or pad_len > len(data):
        raise ValueError("Invalid padding")
    if data[-pad_len:] != bytes([pad_len])*pad_len:
        raise ValueError("Invalid padding")
    return data[:-pad_len]

def chunks(b: bytes, size: int):
    for i in range(0, len(b), size):
        yield b[i:i+size]

def describe_blocks(b: bytes, block_size: int):
    blks = [b[i:i+block_size] for i in range(0, len(b), block_size)]
    uniq = len(set(blks))
    return len(blks), uniq
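
One property of the PKCS#7 helpers is easy to miss: an input that is already block-aligned still gains a full block of padding, which is what makes unpadding unambiguous. The following is a minimal standard-library sketch; the helper is re-declared so the snippet runs on its own, and the sample inputs are made up for illustration.

```python
def pkcs7_pad(data: bytes, block_size: int) -> bytes:
    # Same rule as the notebook helper: always append 1..block_size bytes
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len

short = pkcs7_pad(b"YELLOW", 8)      # 6 bytes -> 2 bytes of padding
aligned = pkcs7_pad(b"SUBMARIN", 8)  # 8 bytes -> a full extra block of padding

print(short)                      # b'YELLOW\x02\x02'
print(len(aligned), aligned[-1])  # 16 8
```

This is also why a CBC ciphertext built with this padding is always strictly longer than the message.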


## 2. Block Ciphers
We'll demonstrate **DES** and **AES** in several modes, and empirically show why **ECB is insecure** (pattern leakage), while **CBC/CTR/GCM** provide better confidentiality (and GCM adds integrity).


### 2.1 DES (ECB and CBC)
- DES block size: 8 bytes (64 bits).
- DES key: 8 bytes (56 effective bits + 8 parity bits).
- **ECB** encrypts each block independently → identical plaintext blocks yield identical ciphertext blocks.
- **CBC** chains blocks with an **IV** → hides patterns.


In [2]:
# DES in ECB vs CBC
key_des = get_random_bytes(8)  # 8 bytes key for DES
plaintext_blocks = (b"HELLODES" * 8)  # 8*8 = 64 bytes, repeated identical blocks

# ECB
cipher_ecb = DES.new(key_des, DES.MODE_ECB)
ct_ecb = cipher_ecb.encrypt(plaintext_blocks)  # already block-aligned
total, unique_ = describe_blocks(ct_ecb, 8)
print("DES-ECB total blocks:", total, "unique blocks:", unique_)

# CBC with random IV
iv_des = get_random_bytes(8)
cipher_cbc = DES.new(key_des, DES.MODE_CBC, iv=iv_des)
ct_cbc = cipher_cbc.encrypt(plaintext_blocks)
total_cbc, unique_cbc = describe_blocks(ct_cbc, 8)
print("DES-CBC total blocks:", total_cbc, "unique blocks:", unique_cbc)

ct_ecb[:32], ct_cbc[:32]  # peek into first 32 bytes of each ciphertext

DES-ECB total blocks: 8 unique blocks: 1
DES-CBC total blocks: 8 unique blocks: 8


(b' \x85"\xff\xdf^Zi \x85"\xff\xdf^Zi \x85"\xff\xdf^Zi \x85"\xff\xdf^Zi',
 b'\x97\x90U\xb6L\xe4\x84\x02z\xd9\xb3\xe97\x8c\xf5\\?-\xef\xb8c\xc6\x03\xde\xba\xb0g&l0o%')

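In the run above, **DES-ECB** produces a **single unique block** out of 8 (pattern leakage), while **DES-CBC** produces **8 distinct blocks**, because each plaintext block is XORed with the previous ciphertext block (or the IV for the first block) before encryption.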
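
The difference between the two modes can be sketched without any cipher library. The snippet below is an illustration under loud assumptions: `toy_encrypt_block` is a hypothetical stand-in built from truncated SHA-256, so it is keyed and deterministic but neither invertible nor secure, and only the encryption direction is shown.

```python
import hashlib

BLOCK = 8  # toy block size in bytes (same as DES)

def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    # Stand-in for a block cipher: deterministic and keyed, but NOT
    # invertible and NOT secure. For illustration only.
    return hashlib.sha256(key + block).digest()[:BLOCK]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ecb_like(key: bytes, pt: bytes) -> list:
    # Each block is processed independently of its neighbours
    return [toy_encrypt_block(key, pt[i:i + BLOCK]) for i in range(0, len(pt), BLOCK)]

def cbc_like(key: bytes, iv: bytes, pt: bytes) -> list:
    out, prev = [], iv
    for i in range(0, len(pt), BLOCK):
        # XOR with the previous ciphertext block (or the IV) before encrypting
        prev = toy_encrypt_block(key, xor_bytes(pt[i:i + BLOCK], prev))
        out.append(prev)
    return out

key = b"k" * 8
iv = bytes(range(8))
pt = b"HELLODES" * 8  # eight identical plaintext blocks

ecb_blocks = ecb_like(key, pt)
cbc_blocks = cbc_like(key, iv, pt)
print("ECB-like unique blocks:", len(set(ecb_blocks)))
print("CBC-like unique blocks:", len(set(cbc_blocks)))
```

In the ECB-like loop the input to the block function never changes, so neither does its output; in the CBC-like loop the input changes at every step because it depends on the previous output.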


### 2.2 AES (ECB, CBC, CTR, GCM)
- AES block size: 16 bytes (128 bits). Keys: 16/24/32 bytes (AES-128/192/256).
- We'll demonstrate:
  - **ECB** (insecure),
  - **CBC** (needs random IV, provides confidentiality),
  - **CTR** (stream-like; needs unique nonce),
  - **GCM** (provides confidentiality **and** integrity via an authentication tag).


In [3]:
# AES-ECB: Pattern leakage demo
key_aes = get_random_bytes(16)  # AES-128
block = b"ABCDEFGHIJKLMNOP"  # 16 bytes
pt = block * 16  # many repeated blocks
cipher_ecb = AES.new(key_aes, AES.MODE_ECB)
ct_ecb = cipher_ecb.encrypt(pt)
total, unique_ = describe_blocks(ct_ecb, 16)
print("AES-ECB total blocks:", total, "unique blocks:", unique_)
ct_ecb[:32]

AES-ECB total blocks: 16 unique blocks: 1


b'\xac\xa14\xe9\xcc6\xc59\xa1]`\xdc\xe2X\x0e\x98\xac\xa14\xe9\xcc6\xc59\xa1]`\xdc\xe2X\x0e\x98'

In [4]:
# AES-CBC with PKCS7 padding
iv_cbc = get_random_bytes(16)
msg = b"ECB leaks patterns. CBC hides them by XORing with previous ciphertext and a random IV."
cipher_cbc = AES.new(key_aes, AES.MODE_CBC, iv=iv_cbc)
ct_cbc = cipher_cbc.encrypt(pkcs7_pad(msg, 16))
print("Ciphertext length (CBC):", len(ct_cbc))

# Decrypt & unpad
pt_rec = pkcs7_unpad(AES.new(key_aes, AES.MODE_CBC, iv=iv_cbc).decrypt(ct_cbc))
pt_rec

Ciphertext length (CBC): 96


b'ECB leaks patterns. CBC hides them by XORing with previous ciphertext and a random IV.'

In [5]:
# AES-CTR (stream-like). Never reuse (key, nonce)!
nonce_ctr = get_random_bytes(8)  # 64-bit nonce for CTR in pycryptodome
cipher_ctr = AES.new(key_aes, AES.MODE_CTR, nonce=nonce_ctr)
msg_ctr = b"CTR turns AES into a stream cipher by creating a keystream from counters."
ct_ctr = cipher_ctr.encrypt(msg_ctr)

# Decrypt
pt_ctr = AES.new(key_aes, AES.MODE_CTR, nonce=nonce_ctr).decrypt(ct_ctr)
pt_ctr

b'CTR turns AES into a stream cipher by creating a keystream from counters.'
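
To make the "keystream from counters" idea concrete, here is a minimal standard-library sketch of the CTR construction. It assumes HMAC-SHA256 as a stand-in for the AES block encryption (real AES-CTR encrypts the counter block with AES), and `toy_ctr_keystream` / `toy_ctr_xor` are hypothetical names, not part of `pycryptodome`.

```python
import hashlib
import hmac

def toy_ctr_keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Keystream block i = PRF(key, nonce || counter_i).
    # HMAC-SHA256 plays the role of the block cipher here (assumption).
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        counter += 1
    return out[:length]

def toy_ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    # Encryption and decryption are the same XOR with the keystream
    ks = toy_ctr_keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, ks))

key = b"\x01" * 16
nonce = b"\x02" * 8
msg = b"No padding needed: CTR output has the plaintext's length."

ct = toy_ctr_xor(key, nonce, msg)
print(len(ct) == len(msg))                # True
print(toy_ctr_xor(key, nonce, ct) == msg)  # True
```

Because the keystream depends only on the key, the nonce, and the counter, repeating a (key, nonce) pair repeats the keystream, which is why the pair must never be reused.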

In [6]:
# AES-GCM (authenticated encryption)
key_gcm = get_random_bytes(16)
aad = b"metadata: course=crypto101; activity=2"
cipher_gcm = AES.new(key_gcm, AES.MODE_GCM)
cipher_gcm.update(aad)
msg_gcm = b"GCM provides confidentiality and integrity (auth tag)."
ct_gcm, tag = cipher_gcm.encrypt_and_digest(msg_gcm)
nonce_gcm = cipher_gcm.nonce  # random nonce generated by the library (16 bytes in this run); must be stored with the ciphertext
print("GCM nonce:", hexlify(nonce_gcm))
print("GCM tag:", hexlify(tag))

# Verify
dec = AES.new(key_gcm, AES.MODE_GCM, nonce=nonce_gcm)
dec.update(aad)
pt_gcm = dec.decrypt_and_verify(ct_gcm, tag)
pt_gcm

GCM nonce: b'efcafd363b05f4cb8519e98d3d0e9912'
GCM tag: b'a016b615b566a4f3404d51304056a93d'


b'GCM provides confidentiality and integrity (auth tag).'

Tamper with the ciphertext or tag and **decryption will fail** in GCM, demonstrating **integrity protection**.


In [7]:
# GCM tampering demo (expect an exception)
bad_ct = bytearray(ct_gcm)
bad_ct[0] ^= 0x01  # flip a bit
try:
    dec_bad = AES.new(key_gcm, AES.MODE_GCM, nonce=nonce_gcm)
    dec_bad.update(aad)
    dec_bad.decrypt_and_verify(bytes(bad_ct), tag)
    print("[!] Unexpected: tampering not detected")
except ValueError as e:  # decrypt_and_verify raises ValueError when the tag does not match
    print("Expected failure due to tampering:", type(e).__name__, str(e)[:80])

Expected failure due to tampering: ValueError MAC check failed
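
The pattern behind `decrypt_and_verify` is "check the tag first, release data only if it matches". The sketch below shows that pattern with HMAC-SHA256 from the standard library. It is an analogy only: GCM computes its tag with GHASH rather than HMAC, and `make_tag` / `verify_tag` are hypothetical helper names.

```python
import hashlib
import hmac

def make_tag(mac_key: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    # Length-prefix the AAD so different (aad, ciphertext) splits cannot collide
    data = len(aad).to_bytes(8, "big") + aad + ciphertext
    return hmac.new(mac_key, data, hashlib.sha256).digest()

def verify_tag(mac_key: bytes, aad: bytes, ciphertext: bytes, tag: bytes) -> bool:
    # compare_digest takes constant time, so it does not leak where tags differ
    return hmac.compare_digest(make_tag(mac_key, aad, ciphertext), tag)

mac_key = b"\x07" * 32
aad = b"metadata: course=crypto101; activity=2"
ciphertext = b"\x10\x20\x30\x40"  # placeholder bytes standing in for a real ciphertext
tag = make_tag(mac_key, aad, ciphertext)

tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]  # flip one bit

print(verify_tag(mac_key, aad, ciphertext, tag))  # True
print(verify_tag(mac_key, aad, tampered, tag))    # False
```

Changing the associated data has the same effect as changing the ciphertext: the tag covers both.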


## 3. Stream Ciphers
We'll start with a **toy XOR keystream** example, then use **ChaCha20** from `pycryptodome`.

**Important**: Reusing the same keystream (same key+nonce) across messages is **catastrophic**. We'll demonstrate why.


In [8]:
# 3.1 Toy XOR stream cipher
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(64)
m1 = b"Attack at dawn."
m2 = b"Meet at the river."
c1 = xor_bytes(m1, keystream)
c2 = xor_bytes(m2, keystream)

# Attacker computes c1 ^ c2 = m1 ^ m2 without ever seeing the keystream
xored = xor_bytes(c1, c2)
print("m1 ^ m2 (hex):", hexlify(xored))

# Suppose the attacker knows m1: (c1 ^ c2) ^ m1 recovers m2.
# zip() stops at the shorter input, so only the first len(m1) = 15 bytes of m2 come back.
recovered_m2 = xor_bytes(xored, m1)
recovered_m2

m1 ^ m2 (hex): b'0c111115430a544100480141050758'


b'Meet at the riv'
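
An attacker does not need to know a whole message. A common approach, usually called crib dragging, is to slide a guessed fragment across `m1 ^ m2` and look for readable output. The following is a minimal sketch using the two messages from the cell above; `drag_crib` is a hypothetical helper name and the guessed fragment `b" at "` is an assumption.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def drag_crib(xored: bytes, crib: bytes) -> dict:
    # For every offset, XOR the guessed fragment into m1 ^ m2.
    # Where the guess really occurs in one message, the result is the
    # corresponding fragment of the other message.
    return {
        i: xor_bytes(xored[i:i + len(crib)], crib)
        for i in range(len(xored) - len(crib) + 1)
    }

m1 = b"Attack at dawn."
m2 = b"Meet at the river."
xored = xor_bytes(m1, m2)  # what the attacker obtains from c1 ^ c2

candidates = drag_crib(xored, b" at ")
print(candidates[6])  # b't th'  (m1[6:10] is b' at ', so this is m2[6:10])
print(candidates[4])  # b'ck a'  (m2[4:8] is b' at ', so this is m1[4:8])
```

Most other offsets yield unreadable bytes, which is how the attacker tells good guesses from bad ones.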

### 3.2 ChaCha20 (stream cipher)
- ChaCha20 uses a **key** and a **nonce** to generate a keystream.
- Never reuse the same (key, nonce) pair.


In [9]:
# ChaCha20 encrypt/decrypt
key = get_random_bytes(32)  # 256-bit key
nonce = get_random_bytes(8)  # 8-byte (64-bit) nonce; pycryptodome's ChaCha20 also accepts 12-byte nonces
cipher = ChaCha20.new(key=key, nonce=nonce)
msg = b"ChaCha20 is a secure stream cipher when nonces are never reused."
ct = cipher.encrypt(msg)
print("ciphertext (hex):", hexlify(ct)[:64], b"...")

cipher2 = ChaCha20.new(key=key, nonce=nonce)
pt = cipher2.decrypt(ct)
pt

ciphertext (hex): b'65f2c66d179a92db53ca8f4ffa8f0f9646b9d1d9b513283042d3382ddc8d4449' b'...'


b'ChaCha20 is a secure stream cipher when nonces are never reused.'

#### Keystream reuse attack with ChaCha20 (do **not** do this in practice)
If we (incorrectly) reuse the same (key, nonce) for two messages, the keystream repeats and the **XOR of ciphertexts** reveals the **XOR of plaintexts**.


In [10]:
key_r = get_random_bytes(32)
nonce_r = get_random_bytes(8)
m1 = b"THE PRICE IS 100 EUR"
m2 = b"THE PRICE IS 900 EUR"

# WRONG: reusing (key, nonce)
c1 = ChaCha20.new(key=key_r, nonce=nonce_r).encrypt(m1)
c2 = ChaCha20.new(key=key_r, nonce=nonce_r).encrypt(m2)

leak = xor_bytes(c1, c2)
print("c1^c2 (hex):", hexlify(leak))

# If attacker knows m1, they can derive m2
m2_recovered = xor_bytes(leak, m1)
m2_recovered

c1^c2 (hex): b'0000000000000000000000000008000000000000'


b'THE PRICE IS 900 EUR'
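
Even without knowing either message, the leaked XOR is informative. The sketch below takes the `c1^c2` value printed in the run above and reads off where the two plaintexts differ; only the standard library is used.

```python
# c1 ^ c2 as printed in the run above
leak = bytes.fromhex("0000000000000000000000000008000000000000")

# A zero byte means both plaintexts agree at that position;
# a non-zero byte marks a difference.
diff_positions = [i for i, b in enumerate(leak) if b != 0]

print(diff_positions)        # [13]
print(leak[13])              # 8
print(ord("1") ^ ord("9"))   # 8, consistent with a digit changing from 1 to 9
```

The attacker learns that the two messages are identical except at byte 13, and that the XOR of the two bytes there is 8. Other digit pairs (for example 0 and 8) give the same XOR, so this narrows the candidates rather than fixing them.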

## 4. Exercises (Solved)

### 4.1 Show why AES-ECB is insecure
We'll encrypt a highly repetitive plaintext and count unique ciphertext blocks.


In [11]:
key = get_random_bytes(16)
pt = (b"PATTERN-LEAK-ECB!" * 32)  # 17-byte pattern: 544 bytes = 34 blocks, realigning with the 16-byte block grid every 17 blocks
ct = AES.new(key, AES.MODE_ECB).encrypt(pt)
total, unique_ = describe_blocks(ct, 16)
print({"total_blocks": total, "unique_blocks": unique_})
assert unique_ < total, "ECB should leak patterns (many identical blocks)."

{'total_blocks': 34, 'unique_blocks': 17}
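
The count of 17 unique blocks follows from the pattern length. ECB maps equal plaintext blocks to equal ciphertext blocks (and different ones to different ones), so it is enough to count unique plaintext blocks. A small standard-library sketch; the 16-byte variant `pattern16` is added here for comparison and is not part of the exercise.

```python
from math import gcd

def unique_blocks(data: bytes, block_size: int = 16) -> int:
    return len({data[i:i + block_size] for i in range(0, len(data), block_size)})

pattern17 = b"PATTERN-LEAK-ECB!"  # 17 bytes, as in the exercise
pattern16 = b"PATTERN-LEAK-ECB"   # 16 bytes, exactly one AES block (comparison only)

# Blocks realign with the pattern every lcm(16, 17) bytes, i.e. every 17 blocks
period_blocks = (16 * 17 // gcd(16, 17)) // 16

print(len(pattern17), period_blocks)  # 17 17
print(unique_blocks(pattern17 * 32))  # 17
print(unique_blocks(pattern16 * 32))  # 1
```

With a pattern of exactly one block, the leak is as severe as it gets: every ciphertext block is identical.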


### 4.2 AES-CBC: effect of the IV
We'll encrypt the **same message** twice with the **same key** but **different random IVs**, and show that the ciphertexts already differ in their first block while both still decrypt to the original message.


In [12]:
key = get_random_bytes(16)
msg = b"Same message, different IVs -> different ciphertexts."
iv1, iv2 = get_random_bytes(16), get_random_bytes(16)
c1 = AES.new(key, AES.MODE_CBC, iv=iv1).encrypt(pkcs7_pad(msg, 16))
c2 = AES.new(key, AES.MODE_CBC, iv=iv2).encrypt(pkcs7_pad(msg, 16))
print("c1[:16] != c2[:16]?", c1[:16] != c2[:16])
pkcs7_unpad(AES.new(key, AES.MODE_CBC, iv=iv1).decrypt(c1))
pkcs7_unpad(AES.new(key, AES.MODE_CBC, iv=iv2).decrypt(c2))

c1[:16] != c2[:16]? True


b'Same message, different IVs -> different ciphertexts.'

### 4.3 Keystream reuse is catastrophic (stream ciphers)
We repeat the ChaCha20 reuse experiment but show recovery via a small known-plaintext fragment.


In [13]:
key = get_random_bytes(32)
nonce = get_random_bytes(8)
m1 = b"To: Alice\nAmount: 000 EUR\n"
m2 = b"To: Alice\nAmount: 999 EUR\n"
c1 = ChaCha20.new(key=key, nonce=nonce).encrypt(m1)
c2 = ChaCha20.new(key=key, nonce=nonce).encrypt(m2)

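x = xor_bytes(c1, c2)  # equals m1 ^ m2
known_prefix = b"To: Alice\n"  # adversary guesses/knows this header of m1
# XORing the known part of m1 into x yields the matching part of m2
# (both headers are identical here, so x is all zeros over this range)
recovered_prefix = xor_bytes(x[:len(known_prefix)], known_prefix)
print("Leaked XOR for header ->", recovered_prefix)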

# Full recovery if full m1 known
assert xor_bytes(x, m1) == m2
b"Recovered equals m2? "+ str(xor_bytes(x, m1) == m2).encode()

Leaked XOR for header -> b'To: Alice\n'


b'Recovered equals m2? True'

### 4.4 DES vs AES brute-force feasibility
Compute a back-of-the-envelope time to search the **entire key space** at **10^12 keys/second** (a rate that is generous to the attacker):


In [14]:
def years_to_bruteforce(bits, rate_keys_per_sec=1e12):
    space = 2**bits
    seconds = space / rate_keys_per_sec
    years = seconds / (60*60*24*365.25)
    return years

print({
    'DES_56_bits_years': years_to_bruteforce(56),
    'AES128_bits_years': years_to_bruteforce(128),
    'AES256_bits_years': years_to_bruteforce(256),
})

{'DES_56_bits_years': 0.0022833673675415095, 'AES128_bits_years': 1.0782897524556319e+19, 'AES256_bits_years': 3.669229891921952e+57}


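Interpretation:
- **DES (56 bits)**: about 0.0023 years, i.e. roughly 20 hours at this rate. It is brute-forceable with specialized hardware (practically broken).
- **AES-128** and **AES-256**: about 10^19 and 10^57 years respectively. The key spaces are astronomically large; brute force is infeasible.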
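
Fractions of a year are hard to read, so the sketch below restates the DES figure in hours and adds the average case, where the key is expected to turn up after half the key space has been searched. It uses the same assumed rate of 10^12 keys/second; `seconds_to_bruteforce` is a hypothetical helper name.

```python
def seconds_to_bruteforce(bits: int, rate_keys_per_sec: float = 1e12) -> float:
    # Worst case: the attacker has to try every key in the space
    return 2**bits / rate_keys_per_sec

des_worst_hours = seconds_to_bruteforce(56) / 3600
des_avg_hours = des_worst_hours / 2  # on average the key is found halfway through

print(round(des_worst_hours, 1), round(des_avg_hours, 1))  # 20.0 10.0

# Each extra key bit doubles the work: AES-128 vs DES is a factor of 2**72
extra_factor = 2 ** (128 - 56)
print(extra_factor)  # 4722366482869645213696
```

The factor of 2^72 is why raising the attacker's speed by a few orders of magnitude changes the DES estimate dramatically but leaves the AES conclusion untouched.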


## 5. Questions & Solutions (Conceptual)
1. **Why was DES replaced by AES?**  
   DES has a **56-bit key**, which is vulnerable to brute force, and a small 64-bit block. AES supports 128/192/256-bit keys, uses 128-bit blocks, and is efficient in both software and hardware.  
2. **Why does ECB leak patterns?**  
   ECB encrypts each block independently; **identical plaintext blocks → identical ciphertext blocks**. No semantic security.  
3. **What do IVs/nonces do in CBC/CTR/GCM?**  
   They ensure **freshness**: encrypting the same message again yields a different ciphertext. In CBC the IV must be **random and unpredictable**; in CTR/GCM the nonce must be **unique** per key.  
4. **What happens if you reuse a stream-cipher nonce?**  
   The keystream repeats, so **c1 ⊕ c2 = m1 ⊕ m2**. An attacker who knows (or guesses) one plaintext recovers the other.  
5. **How does AES achieve security vs classical ciphers?**  
   AES combines larger block and key sizes with strong confusion/diffusion and resistance to known attacks. Used with a secure **mode of operation**, it provides semantic security; AEAD modes add integrity.


## 6. Conclusions
- **AES** is the modern standard block cipher; **use AEAD constructions** (AES-GCM, AES-OCB, or ChaCha20-Poly1305) when possible.
- **Never** use ECB for sensitive data.
- **IV/nonce discipline** is critical: CBC needs **random, unpredictable IVs**; CTR/GCM need **unique nonces** per key.
- **Stream ciphers** are efficient and secure when used correctly; **keystream reuse breaks everything**.
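
One way to make "unique nonces" concrete is to derive them from a counter instead of drawing each one at random. The sketch below is an illustration under assumptions, not a prescribed design: `NonceCounter` is a hypothetical helper that builds 12-byte nonces from a 4-byte random prefix and an 8-byte counter.

```python
import secrets

class NonceCounter:
    """Hypothetical helper: 12-byte nonces = 4 random prefix bytes + 8-byte counter."""

    def __init__(self) -> None:
        self._prefix = secrets.token_bytes(4)  # fixed for the lifetime of this generator
        self._counter = 0

    def next(self) -> bytes:
        if self._counter >= 2**64:
            raise OverflowError("nonce space exhausted; rotate the key")
        nonce = self._prefix + self._counter.to_bytes(8, "big")
        self._counter += 1
        return nonce

gen = NonceCounter()
nonces = [gen.next() for _ in range(1000)]
print(len(nonces[0]), len(set(nonces)))  # 12 1000
```

Uniqueness here depends on the counter never going backwards under the same key, so a real system would have to persist the counter (or change the key) across restarts.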
