# Crypto Challenge Set 1

https://cryptopals.com/sets/1

## 1. Convert hex to base64

https://cryptopals.com/sets/1/challenges/1

In [56]:
data_str = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d" # string
data_b16 = bytes.fromhex(data_str)
print(f"{data_b16=}")

data_b16=b"I'm killing your brain like a poisonous mushroom"


In [354]:
from base64 import b16decode

data_hex = b"49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d" # bytes

#data_b16 = b16decode(data_hex.upper()) # b16decode is case sensitive, needs uppercase encoding
data_b16 = b16decode(data_hex,casefold=True) 
print(f"{data_b16=}")

data_b16=b"I'm killing your brain like a poisonous mushroom"


In [355]:
from base64 import b16decode, b64encode

def hex_to_b64(data_hex: bytes) -> bytes:
    return b64encode(b16decode(data_hex,casefold=True))

data_b64 = hex_to_b64(data_hex) # SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t
print(f"{data_b64=}")

data_b64=b'SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t'


## 2. Fixed XOR

https://cryptopals.com/sets/1/challenges/2

In [356]:
a = bytes.fromhex("1c0111001f010100061a024b53535009181c")
b = bytes.fromhex("686974207468652062756c6c277320657965")
print(f"{a=}, {b=}")

a=b'\x1c\x01\x11\x00\x1f\x01\x01\x00\x06\x1a\x02KSSP\t\x18\x1c', b=b"hit the bull's eye"


In [357]:
def _bytes_xor(a: bytes, b: bytes, quiet=True, check_lens=False) -> bytes:
    if not quiet:
        print(a, '\u2295', b)
    if check_lens and len(a) != len(b):
        raise ValueError("bytestring lengths are not equal")
    return bytes(b1^b2 for b1, b2 in zip(a,b))

def bytes_xor(*args: bytes, quiet = True, check_lens=False) -> bytes:
    assert len(args) > 0
    result = args[0]
    for arg in args[1:]:
        result = _bytes_xor(result, arg, quiet=quiet, check_lens=check_lens)
    return result

In [358]:
xor_ab = bytes_xor(a,b,quiet=False)
print(f"{xor_ab=}")
print(f"{xor_ab.hex()=}")

b'\x1c\x01\x11\x00\x1f\x01\x01\x00\x06\x1a\x02KSSP\t\x18\x1c' ⊕ b"hit the bull's eye"
xor_ab=b"the kid don't play"
xor_ab.hex()='746865206b696420646f6e277420706c6179'


In [359]:
xa = 0x1c0111001f010100061a024b53535009181c
xb = 0x686974207468652062756c6c277320657965
hex(xa^xb)

'0x746865206b696420646f6e277420706c6179'
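
The integer shortcut works here, but `hex()` prints the shortest representation, so any leading zero bytes in the result would be dropped. Below is a minimal sketch of a width-preserving variant, assuming both inputs are equal-length hex strings. `xor_hex` is a hypothetical helper, not part of the original notebook.

```python
def xor_hex(a_hex: str, b_hex: str) -> str:
    """XOR two equal-length hex strings and keep leading zeros in the result."""
    if len(a_hex) != len(b_hex):
        raise ValueError("hex strings must have the same length")
    value = int(a_hex, 16) ^ int(b_hex, 16)
    # Pad back to the input width: hex() alone would strip leading zeros.
    return format(value, f"0{len(a_hex)}x")

full = xor_hex("1c0111001f010100061a024b53535009181c",
               "686974207468652062756c6c277320657965")
short = xor_hex("00ff", "00f0")  # hex(0x00ff ^ 0x00f0) alone would give '0xf'
print(full)
print(short)
```

The second call shows the difference: the padded result keeps all four hex digits.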

## 3. Single-byte XOR cipher

https://cryptopals.com/sets/1/challenges/3

* Character frequencies from a public domain book ("Jane Eyre")
* Score function comparing measured with expected frequencies
* Trying all 256 single-byte keys and returning the plaintext with the best (lowest) score

In [417]:
from collections import defaultdict
from string import ascii_lowercase, ascii_uppercase, ascii_letters

def get_freqs(book,letters=ascii_letters):
    counts = defaultdict(int)
    for letter in letters:
        counts[letter] += book.count(letter)
    total = sum(counts.values())
    return {letter: counts[letter]/total for letter in counts}

with open("input/JaneEyre.txt") as f:
    book = f.read()

freqs_lowercase = get_freqs(book,ascii_lowercase)
freqs_uppercase = get_freqs(book,ascii_uppercase)
freqs_letters = get_freqs(book,ascii_letters)

from collections import Counter
counts = Counter(book)
freqs_all = {char: count/len(book) for char,count in counts.items() }

freqs_letters

{'a': 0.07803539798774925,
 'b': 0.013395918987177907,
 'c': 0.023684960647465267,
 'd': 0.04677742385598827,
 'e': 0.1267185226399993,
 'f': 0.020981379896661476,
 'g': 0.018664914069200216,
 'h': 0.056452807454514,
 'i': 0.06095006217488881,
 'j': 0.0008202853198801562,
 'k': 0.007660145460610745,
 'l': 0.04054873229198174,
 'm': 0.02590682604349877,
 'n': 0.06824674282001397,
 'o': 0.07664128636670861,
 'p': 0.015221956262237374,
 'q': 0.0011737921952154587,
 'r': 0.05948624497279615,
 's': 0.062279447184952555,
 't': 0.083662878915182,
 'u': 0.02979166744463074,
 'v': 0.009519790783677443,
 'w': 0.022626929506496935,
 'x': 0.0015472149508513416,
 'y': 0.02121414674767451,
 'z': 0.00040578606112432613,
 'A': 0.0014687961721678061,
 'B': 0.0008377117151431641,
 'C': 0.00043192565401883794,
 'D': 0.0005053654626272282,
 'E': 0.0006958110680015285,
 'F': 0.0005090996901835871,
 'G': 0.0006435318822125049,
 'H': 0.0013468114053267512,
 'I': 0.010201909683972322,
 'J': 0.0008177958348425

In [408]:
def score_text(text: bytes, freqs=freqs_letters) -> float:
    l = len(text)
    return sum([abs(text.count(ord(letter))/l - freq_exp) for letter, freq_exp in freqs.items()])

def crack_single_xor(cypher: bytes, freqs=freqs_letters) -> tuple:
    # returns (score, plaintext guess, key guess); a lower score means a closer match
    best_guess = (float('inf'), None, None)
    for key in range(256):
        key_full = bytes([key])*len(cypher)
        plaintext = bytes_xor(cypher,key_full)
        # NOTE: the score is always computed against freqs_letters; the freqs argument is not used here
        score = score_text(plaintext, freqs=freqs_letters)
        curr_guess = (score, plaintext, bytes([key]))
        best_guess = min(best_guess, curr_guess)
    return best_guess

In [409]:
data_str = "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"
cypher = bytes.fromhex(data_str)
best_guess = crack_single_xor(cypher,freqs_all)
print(f"{best_guess=}")

best_guess=(0.7704460124783241, b"Cooking MC's like a pound of bacon", b'X')
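
For comparison, a much cruder score also works on this ciphertext. The sketch below is a swapped-in simplification, not the frequency-based method used above: it counts how many bytes are ASCII letters or spaces and keeps the key with the highest count. `crack_single_xor_simple` and the `demo_*` names are hypothetical.

```python
from string import ascii_letters

def crack_single_xor_simple(cipher: bytes) -> tuple:
    """Try all 256 keys; keep the plaintext with the most letters and spaces."""
    allowed = set((ascii_letters + " ").encode())
    best = (-1, b"", 0)  # (score, plaintext, key)
    for key in range(256):
        plaintext = bytes(c ^ key for c in cipher)
        score = sum(byte in allowed for byte in plaintext)
        if score > best[0]:
            best = (score, plaintext, key)
    return best

demo_cipher = bytes.fromhex(
    "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"
)
demo_score, demo_plain, demo_key = crack_single_xor_simple(demo_cipher)
print(demo_score, demo_plain, chr(demo_key))
```

This shortcut is likely to be less reliable on short or noisy inputs than comparing against measured frequencies, which is why the frequency table is kept for the later challenges.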


## 4. Detect single-character XOR

https://cryptopals.com/sets/1/challenges/4

In [363]:
best_guess = (float('inf'), None, None) # score, plaintext guess, key guess
with open("input/4.txt") as f:
    for data_str in f.readlines():
        cypher = bytes.fromhex(data_str.strip())
        # crack_single_xor already returns (score, plaintext, key), so the tuples can be compared directly
        curr_guess = crack_single_xor(cypher)
        best_guess = min(best_guess,curr_guess)
best_guess[1]

b'Now that the party is jumping\n'

## 5. Implement repeating-key XOR

https://cryptopals.com/sets/1/challenges/5

In [364]:
from itertools import cycle

def repeating_key_xor(plaintext: bytes, key: bytes):
    return bytes(p^k for p,k in zip(plaintext,cycle(key)))

In [365]:
plaintext = b"Burning 'em, if you ain't quick and nimble\nI go crazy when I hear a cymbal"
key = b"ICE"

cypher = repeating_key_xor(plaintext, key)
cypher.hex()

'0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f'
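
XOR-ing a byte twice with the same key byte gives back the original byte, so the same routine also decrypts. A minimal self-contained check of that round trip follows; `xor_with_key`, `msg` and `demo_key` are illustrative names chosen so the snippet does not overwrite the notebook's own variables.

```python
from itertools import cycle

def xor_with_key(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR: the key is cycled over the whole input."""
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

msg = b"Burning 'em, if you ain't quick and nimble"
demo_key = b"ICE"

encrypted = xor_with_key(msg, demo_key)
decrypted = xor_with_key(encrypted, demo_key)  # same call, same key

print(encrypted.hex()[:12])
print(decrypted == msg)
```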

## 6. Break repeating-key XOR

https://cryptopals.com/sets/1/challenges/6

In [374]:
def hamming_distance(s1: bytes, s2: bytes) -> int:
    return sum([bin(b1^b2).count("1") for b1,b2 in zip(s1,s2)])

s1 = b"this is a test"
s2 = b"wokka wokka!!!"

hamming_distance(s1,s2)

37
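
Note that `zip()` stops at the shorter input, so the function above silently ignores trailing bytes when the lengths differ. A sketch of a stricter variant, assuming equal-length inputs are required, XORs the two byte strings as big integers and counts the set bits. `hamming_distance_strict` is a hypothetical name.

```python
def hamming_distance_strict(s1: bytes, s2: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    if len(s1) != len(s2):
        raise ValueError("inputs must have the same length")
    # XOR leaves a 1 bit exactly where the two inputs differ.
    diff = int.from_bytes(s1, "big") ^ int.from_bytes(s2, "big")
    return bin(diff).count("1")

dist = hamming_distance_strict(b"this is a test", b"wokka wokka!!!")
print(dist)
```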

In [431]:
from itertools import combinations

def guess_rep_key_xor(b: bytes, kmin=2, kmax=40, quiet=True):
    # guess the keysize by testing several values and choosing the one that gives the smallest
    # normalised Hamming distance between blocks of that size
    keys = []
    for ks in range(kmin,kmax+1):
        # compute normalised Hamming distance between all combinations of blocks of size ks
        nbloc = len(b)//ks
        blocks = [ b[j*ks:(j+1)*ks] for j in range(nbloc) ]
        ndave = 0
        ncomb = 0
        for c in combinations(blocks,2):            
            ndave += hamming_distance(c[0],c[1])
            ncomb += 1
        ndave /= ncomb*ks
        keys.append((ks,ndave))

    # choose the keysize giving the smallest average normalised Hamming distance over all pairs of blocks
    keys = sorted(keys,key=lambda x: x[1])
    keysize = keys[0][0]
    if not quiet:
        print("Guessed KEYSIZE =",keysize)
    
    # Break the ciphertext into blocks of KEYSIZE length, then transpose the blocks: make a block that is the
    # first byte of every block, a block that is the second byte of every block, and so on.
    # Every byte in a transposed block was encrypted with the same key character, so that character
    # can be guessed with the single-character XOR attack implemented in challenge 3.
    nblocks = len(b)//keysize # I'm skipping the last part of the cypher, I could maybe pad it to use the last block
    blocks = []
    for k in range(keysize):
        tblock = []
        for i in range(nblocks):
            tblock.append(b[k+keysize*i])
        blocks.append(tblock)

    # Solve each block as if it was single-character XOR, recompose the key!
    key = b""
    for block in blocks:
        best_guess = crack_single_xor(block)
        key += best_guess[2]
    if not quiet:
        print("Guessed KEY =",key)
    return key

In [442]:
from base64 import b64decode

with open("input/6.txt") as f:
    cipher6_b64 = f.read()

cipher6 = b64decode(cipher6_b64)
key = guess_rep_key_xor(cipher6,quiet=False)

Guessed KEYSIZE = 29
Guessed KEY = b'Terminator X: Bring the noise'


In [443]:
# XOR is its own inverse ((p ^ k) ^ k == p), so the repeating-key XOR used to encode also decodes
plaintext6 = repeating_key_xor(cipher6, key)
print(plaintext6.decode()) # the output is a bytes object; decode() converts it to a str

I'm back and I'm ringin' the bell 
A rockin' on the mike while the fly girls yell 
In ecstasy in the back of me 
Well that's my DJ Deshay cuttin' all them Z's 
Hittin' hard and the girlies goin' crazy 
Vanilla's on the mike, man I'm not lazy. 

I'm lettin' my drug kick in 
It controls my mouth and I begin 
To just let it flow, let my concepts go 
My posse's to the side yellin', Go Vanilla Go! 

Smooth 'cause that's the way I will be 
And if you don't give a damn, then 
Why you starin' at me 
So get off 'cause I control the stage 
There's no dissin' allowed 
I'm in my own phase 
The girlies sa y they love me and that is ok 
And I can dance better than any kid n' play 

Stage 2 -- Yea the one ya' wanna listen to 
It's off my head so let the beat play through 
So I can funk it up and make it sound good 
1-2-3 Yo -- Knock on some wood 
For good luck, I like my rhymes atrocious 
Supercalafragilisticexpialidocious 
I'm an effect and that you can bet 
I can take a fly girl and make her wet. 


## 7. AES in ECB mode

https://cryptopals.com/sets/1/challenges/7

The ciphertext is encrypted with AES-128 in ECB mode under the key "YELLOW SUBMARINE".

### 7.1 Using PyCryptoDome library:

https://pycryptodome.readthedocs.io/en/latest/src/installation.html

In [455]:
from Cryptodome.Cipher import AES

def aes_ecb_decode(cipher: bytes, key: bytes) -> bytes:
    aes = AES.new(key, AES.MODE_ECB) 
    return aes.decrypt(cipher)

In [459]:
with open("input/7.txt") as f:
    cipher7_b64 = f.read()
    cipher7 = b64decode(cipher7_b64)
    
key7 = b"YELLOW SUBMARINE" # the key must be bytes; if it is a str, convert it with encode()
plaintext7 = aes_ecb_decode(cipher7,key7)
print(plaintext7.decode()) # the output is a bytes object; decode() converts it to a str

I'm back and I'm ringin' the bell 
A rockin' on the mike while the fly girls yell 
In ecstasy in the back of me 
Well that's my DJ Deshay cuttin' all them Z's 
Hittin' hard and the girlies goin' crazy 
Vanilla's on the mike, man I'm not lazy. 

I'm lettin' my drug kick in 
It controls my mouth and I begin 
To just let it flow, let my concepts go 
My posse's to the side yellin', Go Vanilla Go! 

Smooth 'cause that's the way I will be 
And if you don't give a damn, then 
Why you starin' at me 
So get off 'cause I control the stage 
There's no dissin' allowed 
I'm in my own phase 
The girlies sa y they love me and that is ok 
And I can dance better than any kid n' play 

Stage 2 -- Yea the one ya' wanna listen to 
It's off my head so let the beat play through 
So I can funk it up and make it sound good 
1-2-3 Yo -- Knock on some wood 
For good luck, I like my rhymes atrocious 
Supercalafragilisticexpialidocious 
I'm an effect and that you can bet 
I can take a fly girl and make her wet. 


* Note the last 4 characters of the decrypted plaintext: they are an effect of padding (see later challenges)
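
A minimal sketch of removing such padding, assuming it follows the PKCS#7 scheme, in which each of the N padding bytes has the value N. The scheme is an assumption here, since the notebook only says the trailing characters come from padding, and `strip_pkcs7` is a hypothetical helper.

```python
def strip_pkcs7(data: bytes, blocksize: int = 16) -> bytes:
    """Remove PKCS#7 padding, raising ValueError if the padding is malformed."""
    if not data or len(data) % blocksize != 0:
        raise ValueError("data length must be a positive multiple of the block size")
    n = data[-1]  # the last byte states how many padding bytes there are
    if n < 1 or n > blocksize or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid PKCS#7 padding")
    return data[:-n]

padded = b"twelve bytes" + b"\x04" * 4  # 12 data bytes + 4 padding bytes = one block
unpadded = strip_pkcs7(padded)
print(unpadded)
```

Under that assumption, `strip_pkcs7(plaintext7)` could be applied before calling `decode()`.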

## 8. Detect AES in ECB mode

https://cryptopals.com/sets/1/challenges/8

https://en.wikipedia.org/wiki/Block_cipher_mode_of_operation#Electronic_codebook_(ECB)

> Remember that the problem with ECB is that it is stateless and deterministic; the same 16 byte plaintext block will always produce the same 16 byte ciphertext.

I can split each ciphertext into 16-byte blocks and look for repeated blocks.

In [471]:
def bytes_to_chuncks(b: bytes, chunksize=16) -> list:
    return [ b[i:i+chunksize] for i in range(0,len(b),chunksize) ]

def detect_aes_ecb_mode(cipher: bytes, blocksize=16):
    blocks = bytes_to_chuncks(cipher,blocksize)
    return len(blocks) - len(set(blocks))

In [476]:
with open("input/8.txt") as f:
    ciphers8 = [ bytes.fromhex(l.strip()) for l in f.readlines() ]

for l,cipher in enumerate(ciphers8):
    rep = detect_aes_ecb_mode(cipher, blocksize=16)
    if rep:
        print(f"Cipher at line {l} has {rep} block repetitions")

Cipher at line 132 has 3 block repetitions
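
The count above says how many blocks are duplicates but not which ones. Here is a small sketch that reports the repeated blocks themselves, run on synthetic bytes rather than the challenge file. `repeated_blocks` and `sample` are hypothetical names.

```python
from collections import Counter

def repeated_blocks(data: bytes, blocksize: int = 16) -> dict:
    """Map each block that occurs more than once to its number of occurrences."""
    blocks = [data[i:i + blocksize] for i in range(0, len(data), blocksize)]
    return {block: n for block, n in Counter(blocks).items() if n > 1}

# Five 16-byte blocks, three of which are identical.
sample = b"A" * 16 + b"B" * 16 + b"A" * 16 + b"C" * 16 + b"A" * 16
dupes = repeated_blocks(sample)
print(dupes)
```

For this sample the notebook's metric `len(blocks) - len(set(blocks))` would be 2, since 5 blocks contain 3 distinct values.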
