
## Step 1 - Convert hex to base64

> The string:
> ```
> 49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d
> ```
> Should produce:
> ```
> SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t
> ```
> So go ahead and make that happen. You'll need to use this code for the rest of the exercises.
>
> ### Comment
>
> Always operate on raw bytes, never on encoded strings. Only use hex and base64 for pretty-printing.

In [2]:
# your code here ... (put some comments to explain what you did)
from base64 import b64encode, b64decode

hex_string = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"

def hexToBase64(hex_input: str) -> str:
    # Decode the hex text to raw bytes, then base64-encode those bytes for display
    encoded = b64encode(bytes.fromhex(hex_input)).decode()
    return encoded

hexToBase64(hex_string)

'SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t'

## Step 2 - Fixed XOR

> Write a function that takes two equal-length buffers and produces their XOR combination.
>
> If your function works properly, then when you feed it the string:
> ```
> 1c0111001f010100061a024b53535009181c
> ```
> ... after hex decoding, and when XOR'd (bitwise) against:
> ```
> 686974207468652062756c6c277320657965
> ```
> ... should produce:
> ```
> 746865206b696420646f6e277420706c6179
> ```

In [9]:
# your code with comments ... (feel free to add as many helper functions as you need!)
def hex_to_binary(hex_str: str) -> str:
    """Convert a hexadecimal string to a binary string."""
    binary_str = bin(int(hex_str, 16))[2:]
    return binary_str.zfill(len(hex_str) * 4)

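def binary_to_hex(binary_str: str) -> str:
    """Convert a binary string to a hexadecimal string, keeping leading zeros."""
    # Pad to a whole number of nibbles so every 4 bits map to one hex digit
    binary_str = binary_str.zfill((len(binary_str) + 3) // 4 * 4)
    hex_str = hex(int(binary_str, 2))[2:]
    return hex_str.zfill(len(binary_str) // 4)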

def fixed_xor(first_buffer: str, second_buffer: str) -> str:
    """Perform a fixed XOR on two hexadecimal strings."""
    binary_first = hex_to_binary(first_buffer)
    binary_second = hex_to_binary(second_buffer)
    
    # Ensure the input strings are of equal length
    if len(binary_first) != len(binary_second):
        raise ValueError("Input strings must be of equal length")

    # Perform bitwise XOR on each pair of corresponding bits
    result_binary = ''.join('1' if bit1 != bit2 else '0' for bit1, bit2 in zip(binary_first, binary_second))
    
    # Convert the binary result back to hex
    hex_result = binary_to_hex(result_binary)
    return hex_result

first_buffer = '1c0111001f010100061a024b53535009181c'
second_buffer = '686974207468652062756c6c277320657965'

result = fixed_xor(first_buffer, second_buffer)
print(result)


746865206b696420646f6e277420706c6179
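
As an alternative sketch that follows the Step 1 advice to operate on raw bytes, the XOR can be done directly on `bytes` objects, with hex used only at the boundaries. The helper name `xor_bytes` is hypothetical and is not part of the notebook above.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("Buffers must be of equal length")
    # zip pairs up corresponding bytes; ^ is the bitwise XOR on integers
    return bytes(x ^ y for x, y in zip(a, b))

# Decode hex once, XOR the raw bytes, and re-encode only for display
left = bytes.fromhex("1c0111001f010100061a024b53535009181c")
right = bytes.fromhex("686974207468652062756c6c277320657965")
print(xor_bytes(left, right).hex())
```

Because `bytes.hex()` always emits two digits per byte, this version cannot lose leading zeros.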


## Step 3 - Single-byte XOR cipher

> The hex encoded string:
> ```
> 1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736
> ```
> ... has been XOR'd against a single character. Find the key (which is one byte) and decrypt the message. The message is a meaningful sentence in English!
>
> You should write code to find the key and decrypt the message. Don't do it manually!
>
> ### Comment
> There are several mini steps to achieve this! First, you need a strategy for searching in the key space. Second, you need a test/scoring mechanism to check whether the decrypted message is meaningful or not (i.e., detecting garbage vs. the correct output). You can read more about the *"Caesar"* cipher to get some ideas and more background!

#### Description
*A brief description of your approach. Don't just put the code. First explain what you did and WHY you did it!*

<p> (your description)<br>
My strategy was to brute-force the key space (lowercase and uppercase English letters) by XORing the decoded ciphertext bytes with each candidate key. I then defined a function that scores each decoded text by how many common English words it contains, with a penalty for non-printable characters, and used that score to judge whether a decrypted message is meaningful. Finally, I returned the decrypted text and the key that gave the highest score.
</p>
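
A possible variation on this approach, sketched below, searches all 256 byte values instead of only letters and scores candidates by character frequency rather than by word matches. The frequency table holds approximate values and, like the names `ENGLISH_FREQ`, `english_score` and `break_single_byte_xor`, is an assumption made for illustration.

```python
# Approximate relative frequencies of English letters plus the space character.
# The exact values are an assumption; only the rough ranking matters here.
ENGLISH_FREQ = {
    "a": 0.082, "b": 0.015, "c": 0.028, "d": 0.043, "e": 0.127, "f": 0.022,
    "g": 0.020, "h": 0.061, "i": 0.070, "j": 0.002, "k": 0.008, "l": 0.040,
    "m": 0.024, "n": 0.067, "o": 0.075, "p": 0.019, "q": 0.001, "r": 0.060,
    "s": 0.063, "t": 0.091, "u": 0.028, "v": 0.010, "w": 0.024, "x": 0.002,
    "y": 0.020, "z": 0.001, " ": 0.130,
}

def english_score(data: bytes) -> float:
    """Sum character frequencies; subtract a penalty for non-printable bytes."""
    score = 0.0
    for byte in data:
        if 32 <= byte <= 126:
            score += ENGLISH_FREQ.get(chr(byte).lower(), 0.0)
        elif byte not in (9, 10, 13):  # tab, newline, carriage return are tolerated
            score -= 0.1
    return score

def break_single_byte_xor(ciphertext: bytes):
    """Try all 256 key bytes and return (key, plaintext) with the best score."""
    def decrypt(key: int) -> bytes:
        return bytes(byte ^ key for byte in ciphertext)
    best_key = max(range(256), key=lambda k: english_score(decrypt(k)))
    return best_key, decrypt(best_key)

ciphertext = bytes.fromhex(
    "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"
)
key, plaintext = break_single_byte_xor(ciphertext)
print(chr(key), plaintext)
```

Working on `bytes` throughout avoids the `decode(..., errors='ignore')` step, which silently drops bytes and can hide garbage from the scorer.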

In [31]:
def single_byte_xor_cipher(hex_string, key):
    hex_bytes = bytes.fromhex(hex_string)

    # XOR each byte with the key (ASCII value of the provided key)
    decrypted_bytes = bytes([byte ^ key for byte in hex_bytes])

    # Convert the result to a string
    decrypted_text = decrypted_bytes.decode('utf-8', errors='ignore')

    return decrypted_text

def score_decryption(text):
    # Count the number of common English words
    common_words = ["the", "and", "is", "it", "in", "to", "that", "you", "with", "for", "a", "like"]
    word_count = sum(1 for word in common_words if word in text.lower())
    # Penalize outputs with unreadable ASCII output
    non_printable_penalty = text.count('\x07') + text.count('\x00')  # Counting ASCII 7 (bell) and ASCII 0 (null)
    return word_count - non_printable_penalty

def xor_cipher_solver(text):
    # Track the best-scoring candidate seen so far
    topscore = 0
    plaintext = ''
    bestkey = None
    # Brute-force lowercase and uppercase letters as possible keys
    candidate_keys = list(range(ord('a'), ord('z') + 1)) + list(range(ord('A'), ord('Z') + 1))
    for key in candidate_keys:
        decrypted_message = single_byte_xor_cipher(text, key)
        score = score_decryption(decrypted_message)
        if score > topscore:
            plaintext = decrypted_message
            topscore = score
            bestkey = key
        # print(f"Key {chr(key)}: {decrypted_message}  Score: {score}")
    # chr() needs an integer, so return None as the key when no candidate scored above zero
    return (plaintext, chr(bestkey) if bestkey is not None else None)

hex_encoded_message = '1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736'
xor_cipher_solver(hex_encoded_message)


("Cooking MC's like a pound of bacon", 'X')

## Step 4 - Detect single-character XOR

> One of the 60-character strings in [this file](data/04.txt) has been encrypted by single-character XOR (each line is one string).
>
> Find it.
>
> ### Comment
> You should use your code in Step 3 to test each line. One line should output a meaningful message. Remember that you don't know the key either, but you can find it for each line (if any).

#### Description
*A brief description of your approach. Don't just put the code. First explain what you did and WHY you did it!*

<p> (your description)<br>
...
</p>
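
One way to structure the search, as a rough sketch: find the best single-byte key for every line, then keep the line whose best candidate looks most like English. The scoring heuristic here (counting the characters of "etaoin shrdlu" in either case) and all function names are assumptions for illustration, and the sketch makes no claim about what the file actually decrypts to.

```python
# Bytes for the most common English characters, in both cases, plus space
COMMON = set(b"etaoin shrdluETAOINSHRDLU")

def plausibility(data: bytes) -> int:
    """Count bytes that are among the most common English characters."""
    return sum(b in COMMON for b in data)

def best_single_byte_key(ciphertext: bytes):
    """Return (score, key, plaintext) for the best of all 256 single-byte keys."""
    candidates = ((k, bytes(b ^ k for b in ciphertext)) for k in range(256))
    key, plain = max(candidates, key=lambda kp: plausibility(kp[1]))
    return plausibility(plain), key, plain

def detect_single_byte_xor(hex_lines):
    """Return (line_index, key, plaintext) for the most English-looking line."""
    best = None
    for index, line in enumerate(hex_lines):
        score, key, plain = best_single_byte_key(bytes.fromhex(line))
        if best is None or score > best[0]:
            best = (score, index, key, plain)
    if best is None:
        raise ValueError("No lines to test")
    return best[1:]

# Possible usage with the file from the task statement:
# with open("data/04.txt") as f:
#     index, key, plain = detect_single_byte_xor(f.read().split())
```

Comparing the best score per line matters because every line yields some "best" key; only the score tells the encrypted line apart from the others.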

Decrypted Message: bNNJHOFlbRMHJD@QNTOENGC@BNO
Key: y


## Step 5 - Implement repeating-key XOR

> Here is the opening stanza of an important work of the English language:
> ```
> Burning 'em, if you ain't quick and nimble
> I go crazy when I hear a cymbal
> ```
> Encrypt it, under the key "ICE", using repeating-key XOR.
>
> In repeating-key XOR, you'll sequentially apply each byte of the key; the first byte of plaintext will be XOR'd against I, the next C, the next E, then I again for the 4th byte, and so on.
>
> It should come out to:
> ```
> 0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272
> a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f
> ```
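
A minimal sketch of the repeating-key scheme described above, assuming the two lines of the stanza are joined by a single newline byte. The function name `repeating_key_xor` is hypothetical.

```python
from itertools import cycle

def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """XOR data with the key repeated as often as needed (hypothetical helper)."""
    if not key:
        raise ValueError("Key must not be empty")
    # cycle() yields I, C, E, I, C, E, ... so each data byte meets the next key byte
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

stanza = b"Burning 'em, if you ain't quick and nimble\nI go crazy when I hear a cymbal"
print(repeating_key_xor(stanza, b"ICE").hex())
```

Since XOR is its own inverse, calling the same function on the ciphertext with the same key returns the plaintext.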


In [None]:
# your code with comments

## Step 6 (Main Step) - Break repeating-key XOR

> There's a file [here](data/06.txt). It's been base64'd after being encrypted with repeating-key XOR.
>
> Decrypt it.
>
> Here's how:
>
> - Let KEYSIZE be the guessed length of the key; try values from 2 to (say) 40.
>
> - Write a function to compute the edit distance/Hamming distance between two strings. The Hamming distance is just the number of differing bits. The distance between `"this is a test"` and `"wokka wokka!!!"` is 37. Make sure your code agrees before you proceed.
>
> - For each KEYSIZE, take the first KEYSIZE worth of bytes, and the second KEYSIZE worth of bytes, and find the edit distance between them. Normalize this result by dividing by KEYSIZE.
>
> - The KEYSIZE with the smallest normalized edit distance is probably the key. You could proceed perhaps with the smallest 2-3 KEYSIZE values. Or take 4 KEYSIZE blocks instead of 2 and average the distances.
>
> - Now that you probably know the KEYSIZE: break the ciphertext into blocks of KEYSIZE length.
>
> - Now transpose the blocks: make a block that is the first byte of every block, and a block that is the second byte of every block, and so on.
>
> - Solve each block as if it was single-character XOR. You already have code to do this.
> For each block, the single-byte XOR key that produces the best looking histogram is the repeating-key XOR key byte for that block. Put them together and you have the key.
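
The first half of the recipe above (Hamming distance and KEYSIZE ranking) could be sketched as follows. The names `hamming_distance` and `rank_keysizes`, and the choice to average over all pairs of the first four blocks, are assumptions rather than the required solution.

```python
from itertools import combinations

def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("Inputs must be of equal length")
    # XOR leaves a 1 bit wherever the inputs differ; count those bits
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def rank_keysizes(ciphertext: bytes, min_size: int = 2, max_size: int = 40,
                  blocks: int = 4) -> list:
    """Return candidate key sizes, most likely first."""
    scores = []
    for size in range(min_size, max_size + 1):
        chunks = [ciphertext[i * size:(i + 1) * size] for i in range(blocks)]
        if len(chunks[-1]) < size:
            break  # not enough ciphertext for this size or any larger one
        pairs = list(combinations(chunks, 2))
        average = sum(hamming_distance(x, y) for x, y in pairs) / len(pairs)
        # Normalize by the key size so different sizes are comparable
        scores.append((average / size, size))
    return [size for _, size in sorted(scores)]

print(hamming_distance(b"this is a test", b"wokka wokka!!!"))
```

Trying the first two or three entries of the returned list, as the task text suggests, guards against the smallest distance pointing at the wrong size.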

#### Description
*A brief description of your approach. Don't just put the code. First explain what you did and WHY you did it!*

<p> (your description)<br>
...
</p>
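
For the transposition and per-column solving, one possible outline is shown below. It assumes the key size has already been guessed, and it scores each column by simply counting ASCII letters and spaces, which is a simplifying assumption standing in for the Step 3 scorer. All names are hypothetical.

```python
import string

# Byte values treated as "plausible text" by this simplified scorer
TEXT_BYTES = set((string.ascii_letters + " ").encode())

def solve_single_byte(column: bytes) -> int:
    """Return the key byte that leaves the most ASCII letters and spaces."""
    return max(range(256), key=lambda k: sum((b ^ k) in TEXT_BYTES for b in column))

def break_repeating_key_xor(ciphertext: bytes, keysize: int):
    """Recover (key, plaintext) for a given key size."""
    # Transpose: column i holds every byte that was XOR'd with key byte i
    columns = [ciphertext[i::keysize] for i in range(keysize)]
    key = bytes(solve_single_byte(column) for column in columns)
    plaintext = bytes(b ^ key[i % keysize] for i, b in enumerate(ciphertext))
    return key, plaintext

# Possible usage with the file from the task statement:
# from base64 import b64decode
# with open("data/06.txt") as f:
#     ciphertext = b64decode(f.read())
# key, plaintext = break_repeating_key_xor(ciphertext, keysize=...)  # size from the ranking step
```

The slice `ciphertext[i::keysize]` performs the transposition in one step, so no explicit block splitting is needed.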

In [None]:
# your code with comments