# Table of Contents<a name="toc"></a>
* [Convert hex to base64](#prob1)
* [Fixed XOR](#prob2)
* [Single-byte XOR cipher](#prob3)
* [Detect single-character XOR](#prob4)
* [Implementing repeating-key XOR](#prob5)
* [Breaking repeating-key XOR](#prob6)

# Convert hex to base64<a name="prob1"></a>

In [4]:
string_to_convert = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"

In [5]:
import binascii, base64
output = base64.b64encode(binascii.unhexlify(string_to_convert)).decode()
print(f"base64 encoded: {output}")

base64 encoded: SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t


# Fixed XOR<a name="prob2"></a>

In [10]:
%%capture
!pip install numpy
import numpy as np

In [11]:
cipher = "1c0111001f010100061a024b53535009181c"

In [12]:
xor_block = "686974207468652062756c6c277320657965"

In [13]:
def xor_encrypt(cipher: bytes, block: bytes) -> bytes:
    cipher_npa = np.frombuffer(cipher, dtype=np.uint8)
    block_npa = np.frombuffer(block, dtype=np.uint8)
    return np.bitwise_xor(cipher_npa, block_npa).tobytes()

In [14]:
ciphertext = xor_encrypt(binascii.unhexlify(cipher), binascii.unhexlify(xor_block))
print(f"ciphertext, hex: {binascii.hexlify(ciphertext).decode()}")
print(f"ciphertext, bytes: {ciphertext}")

ciphertext, hex: 746865206b696420646f6e277420706c6179
ciphertext, bytes: b"the kid don't play"


# Single-byte XOR cipher<a name="prob3"></a>

In [15]:
%%capture
!pip install nltk
import nltk
nltk.download("words")
from nltk.corpus import words

In [16]:
ciphertext = "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"

Iterate over every possible single-byte XOR key, 0x00 to 0xFF, and keep only the candidates whose bytes all fall within the printable ASCII range, 0x20 to 0x7E. Of those, keep the candidates that contain at least one word from the NLTK `words` corpus.

**Assumption:** the plaintext is printable ASCII.
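
The printable-ASCII part of this search can also be sketched without NumPy or NLTK. This is only an illustration, not the notebook's method: `single_byte_xor` and `printable_candidates` are hypothetical helper names, and the dictionary-word check is left out, so more than one key may survive the filter.

```python
import binascii

def single_byte_xor(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same one-byte key."""
    return bytes(b ^ key for b in data)

def printable_candidates(data: bytes) -> dict:
    """Map each key whose output is entirely printable ASCII to that output."""
    found = {}
    for key in range(0x100):
        candidate = single_byte_xor(data, key)
        # printable ASCII is assumed to be 0x20 through 0x7E inclusive
        if all(0x20 <= b <= 0x7E for b in candidate):
            found[key] = candidate.decode("ascii")
    return found

ciphertext_hex = "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"
candidates = printable_candidates(binascii.unhexlify(ciphertext_hex))
for key, text in candidates.items():
    print(f"XOR: 0x{key:02x}; candidate: {text}")
```

The word-list lookup in the next cell is what narrows these candidates down to readable English.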

In [27]:
ciphertext_b = binascii.unhexlify(ciphertext)
possible_results = {}
for i in range(0x100):
    block = np.ones(len(ciphertext_b), dtype=np.uint8) * i
    candidate = xor_encrypt(ciphertext_b, block.tobytes())
    # keep only candidates made up entirely of printable ASCII (0x20 to 0x7E)
    if True not in map(lambda x: x < 0x20 or x > 0x7e, candidate):
        if True in map(lambda word: word in words.words(), candidate.decode().split()):
            possible_results[i] = candidate.decode()
for k, v in possible_results.items():
    print(f"XOR: 0x{k:02x}; cipher: {v}")

XOR: 0x58; cipher: Cooking MC's like a pound of bacon


# Detect single-character XOR<a name="prob4"></a>

In [28]:
class CipherCandidate:
    row: int
    xor: int
    cipher: str
    def __init__(self, row, xor, cipher):
        self.row = row
        self.xor = xor
        self.cipher = cipher
    def __str__(self):
        return f"Row: {self.row}; XOR: 0x{self.xor:02x}; Cipher: {self.cipher}"

In [30]:
results = []
# the context manager closes the file once every line has been read
with open("4.txt") as fd:
    for idx, line in enumerate(fd):
        # strip the trailing newline before decoding the hex
        line_b = binascii.unhexlify(line.rstrip("\n"))
        for xor in range(0x100):
            xor_block = np.ones(len(line_b), dtype=np.uint8) * xor
            cipher = xor_encrypt(line_b, xor_block.tobytes())
            if True not in map(lambda x: x < 0x20 or x > 0x7f, cipher):
                if True in map(lambda word: word in words.words(), cipher.decode().split()):
                    results.append(CipherCandidate(idx, xor, cipher.decode()))
for r in results:
    print(r)

Row: 225; XOR: 0x71; Cipher: F J\{?O#`F[KpB=, r}77OF'X}|cS
Row: 295; XOR: 0x70; Cipher: F9QoQt&unYk<(=w9R|X{Z #oVYq N


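Neither candidate reads as English, so I couldn't figure out what the message was supposed to be. A likely cause is the filter itself: it rejects any candidate containing a byte below 0x20, so a plaintext that ends in a newline (0x0a) would be thrown out even under the correct key.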
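
A possible alternative to the strict printable-ASCII filter is to score candidates rather than reject them outright. The sketch below is an assumption about what might work, not something the notebook ran: `score` and `best_single_byte_xor` are hypothetical names, the heuristic (fraction of letters and spaces) is deliberately crude, and the demo ciphertext is made up here rather than taken from `4.txt`.

```python
def score(candidate: bytes) -> float:
    """Fraction of bytes that are ASCII letters or spaces (a crude English heuristic)."""
    if not candidate:
        return 0.0
    good = sum(
        1 for b in candidate
        if b == 0x20 or 0x41 <= b <= 0x5A or 0x61 <= b <= 0x7A
    )
    return good / len(candidate)

def best_single_byte_xor(data: bytes):
    """Return (score, key, plaintext) for the highest-scoring single-byte key."""
    best = (0.0, 0, b"")
    for key in range(0x100):
        candidate = bytes(b ^ key for b in data)
        s = score(candidate)
        if s > best[0]:
            best = (s, key, candidate)
    return best

# Made-up demo: a message ending in a newline, which a strict printable filter rejects.
demo_plain = b"the quick brown fox jumps over the lazy dog\n"
demo_cipher = bytes(b ^ 0x35 for b in demo_plain)
demo_score, demo_key, demo_text = best_single_byte_xor(demo_cipher)
print(f"XOR: 0x{demo_key:02x}; text: {demo_text!r}")
```

Running the same scoring over every line of the input file and keeping the single best result would be one way to apply it to this problem.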

# Implementing repeating-key XOR<a name="prob5"></a>

In [42]:
phrase = "Burning 'em, if you ain't quick and nimble\nI go crazy when I hear a cymbal"

In [43]:
xor_block = b"ICE"

In [44]:
from typing import Iterator

def xor_block_encrypt(cipher: bytes, xor_block: bytes) -> Iterator[bytes]:
    # generator: yields the input XORed with the key, one key-length segment at a time
    while len(cipher) > 0:
        segment = cipher[:len(xor_block)]
        cipher = cipher[len(xor_block):]
        # the final segment may be shorter than the key, so trim the key to match
        yield xor_encrypt(segment, xor_block[:len(segment)])

In [45]:
phrase_b = phrase.encode("utf8")
result = b""
for ciphertext in xor_block_encrypt(phrase_b, xor_block):
    result += ciphertext
print(binascii.hexlify(result).decode())

0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f
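
For comparison, repeating-key XOR can also be written in plain Python by cycling the key with `itertools.cycle`. This is a sketch rather than the notebook's implementation, and `repeating_key_xor` is a hypothetical name. Because XOR is its own inverse, applying the same key a second time recovers the plaintext, which makes a convenient sanity check.

```python
import binascii
from itertools import cycle

def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

plain = b"Burning 'em, if you ain't quick and nimble\nI go crazy when I hear a cymbal"
encrypted = repeating_key_xor(plain, b"ICE")
print(binascii.hexlify(encrypted).decode())

# Decryption is the same operation with the same key.
decrypted = repeating_key_xor(encrypted, b"ICE")
print(decrypted == plain)
```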


# Breaking repeating-key XOR<a name="prob6"></a>

## Implement Hamming Distance function
XORing two equal-length byte strings produces a 1 bit at every position where they differ and a 0 bit everywhere else. Counting the 1 bits in that result gives the Hamming distance between the two strings.
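
To make the bit counting concrete, here is a small single-byte illustration. It is only a sketch, and the variable names are chosen for this example: the characters `t` and `w` differ in their two lowest bits, so their distance is 2.

```python
a = ord("t")  # 0b01110100
b = ord("w")  # 0b01110111

# XOR leaves a 1 wherever the two bytes differ.
diff = a ^ b
print(f"{a:08b}")
print(f"{b:08b}")
print(f"{diff:08b}")

# Counting the 1 bits in the XOR result gives the Hamming distance.
distance = bin(diff).count("1")
print(distance)
```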

In [49]:
def hamming(str1: bytes, str2: bytes) -> int:
    # XOR marks every differing bit with a 1
    diff = np.frombuffer(xor_encrypt(str1, str2), dtype=np.uint8)
    # expand each byte into its 8 bits and count the 1s
    return int(np.unpackbits(diff).sum())

Check that the Hamming distance between the strings `this is a test` and `wokka wokka!!!` is `37`.

In [52]:
str1 = b"this is a test"
str2 = b"wokka wokka!!!"
hamming(str1, str2)

37

In [54]:
%%capture
!pip install matplotlib
import matplotlib.pyplot as plt

## Determine XOR keysize