## Question 385

# Description

This problem was asked by Apple.

You are given a hexadecimal-encoded string that has been XOR'd against a single char.

Decrypt the message. For example, given the string:

`7a575e5e5d12455d405e561254405d5f1276535b5e4b12715d565b5c551262405d505e575f`

You should be able to decrypt it and get:

`Hello world from Daily Coding Problem`


```python
from collections import Counter


def frequency_analysis_decrypt(hex_string):
    """
    Decrypt a single-byte-XOR hex string by frequency analysis.

    Returns a list of (key character, decrypted text) candidates.
    """
    hex_bytes = bytes.fromhex(hex_string)

    # count the frequency of each byte in the ciphertext
    freq = Counter(hex_bytes)

    # the four most common ciphertext bytes
    most_common_bytes = [x[0] for x in freq.most_common(4)]

    # common letters in English
    common_chars = [ord("e"), ord("t"), ord("a"), ord("o")]

    decrypted_messages = []

    # pair each common ciphertext byte with each common English letter;
    # their XOR is a candidate key
    for byte in most_common_bytes:
        for char in common_chars:
            key = byte ^ char
            decrypted = "".join([chr(b ^ key) for b in hex_bytes])
            # keep only candidates made up entirely of printable ASCII
            if all(32 <= ord(c) <= 126 for c in decrypted):
                decrypted_messages.append((chr(key), decrypted))

    return decrypted_messages
```

```python
# Decrypt the message using frequency analysis
decrypted_messages_frequency_analysis = frequency_analysis_decrypt('7a575e5e5d12455d405e561254405d5f1276535b5e4b12715d565b5c551262405d505e575f')
decrypted_messages_frequency_analysis
```

```
[(';', 'Aleef)~f{em)o{fd)Mh`ep)Jfm`gn)Y{fkeld'),
 ('1', 'Kfool#tlqog#eqln#Gbjoz#@lgjmd#Sqlaofn'),
 ('8', 'Boffe*}exfn*lxeg*Nkcfs*Iencdm*Zxehfog'),
 ('<', 'Fkbba.ya|bj.h|ac.Jogbw.Majg`i.^|albkc'),
 ('2', 'Hello world from Daily Coding Problem'),
 ('%', '_r{{x7`xe{s7qexz7Sv~{n7Txs~yp7Gexu{rz')]
```

## Analysis

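Because the whole message was XOR'd with a single byte, every occurrence of a plaintext character maps to the same ciphertext byte, so character frequencies survive encryption. Assuming the original message is in English, we can therefore use frequency analysis. The most common letters in the English language are 'e', 't', 'a', and 'o'. XOR'ing each of the most common bytes in the ciphertext with each of these letters yields a small set of candidate keys, and each candidate key is then used to decrypt the whole message.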
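The key-derivation step can be looked at in isolation. The following is a minimal sketch using the example ciphertext; the names `CIPHER_HEX`, `top_bytes`, and `candidate_keys` are chosen here for illustration and are not part of the solution above.

```python
from collections import Counter

CIPHER_HEX = "7a575e5e5d12455d405e561254405d5f1276535b5e4b12715d565b5c551262405d505e575f"
cipher = bytes.fromhex(CIPHER_HEX)

# Four most frequent ciphertext bytes together with their counts
top_bytes = Counter(cipher).most_common(4)

# If ciphertext byte c encrypts plaintext letter p, then key = c ^ p
candidate_keys = sorted({c ^ ord(p) for c, _ in top_bytes for p in "etao"})

for c, count in top_bytes:
    print(f"0x{c:02x} appears {count} times")
print([f"0x{k:02x}" for k in candidate_keys])
```

For this ciphertext, pairing the byte `0x5d` with the guess 'o' gives the key `0x32`.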

The function returns six printable candidates, and one of them is the expected message:

`Hello world from Daily Coding Problem`

This candidate comes from pairing the ciphertext byte `0x5d` with the guess 'o': `0x5d ^ 0x6f = 0x32`, which is the ASCII character '2'. The other five candidates are printable but not readable English, so the final choice still has to be made by eye or by a scoring step.
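Since several printable candidates come back, picking the readable one can be automated. The sketch below assumes a very simple heuristic (the share of letters and spaces in the text) that is not part of the original solution; `xor_decrypt` and `english_score` are hypothetical helper names, and the candidate keys are copied from the output above.

```python
CIPHER_HEX = "7a575e5e5d12455d405e561254405d5f1276535b5e4b12715d565b5c551262405d505e575f"


def xor_decrypt(hex_string, key):
    """XOR every byte of the hex-encoded string with a single-byte key."""
    return "".join(chr(b ^ key) for b in bytes.fromhex(hex_string))


def english_score(text):
    """Fraction of characters that are letters or spaces (assumed heuristic)."""
    return sum(c.isalpha() or c == " " for c in text) / len(text)


# Key characters of the six printable candidates returned above
candidate_keys = ";18<2%"

# Keep the key whose decryption looks most like English under the heuristic
best_key = max(
    candidate_keys,
    key=lambda k: english_score(xor_decrypt(CIPHER_HEX, ord(k))),
)
print(best_key, xor_decrypt(CIPHER_HEX, ord(best_key)))
```

A heuristic this crude may fail on short or unusual messages; a fuller letter-frequency comparison would be more robust.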


The time complexity of the frequency analysis approach breaks down into two parts: finding the most common bytes in the ciphertext, and decrypting the message with each candidate key derived from those bytes and a set of common English characters.

1. **Finding Most Common Bytes**:

   - We first convert the hexadecimal string into bytes, which is an \( O(n) \) operation where \( n \) is the length of the hexadecimal string.
   - Then, we use a counter to find the frequency of each byte. Counting each byte in a sequence of length \( n \) is also an \( O(n) \) operation.
   - Extracting the most common bytes from this frequency distribution is \( O(1) \): the counter holds at most 256 distinct byte values regardless of \( n \), and we only take the top four.

2. **Trying Each Candidate Key**:
   - Each of the most common bytes is XOR'd with each of the common English characters to form a candidate key. With \( m \) common bytes and \( k \) common English characters, this gives \( m \times k \) candidate keys.
   - Decrypting the byte sequence with one candidate key and checking that the result is printable are both \( O(n) \), so the total complexity for this part is \( O(m \times k \times n) \). Since \( m \) and \( k \) are small and fixed (independent of \( n \)), this simplifies to \( O(n) \).

Overall, the complexity of the entire process is dominated by the length of the hexadecimal string, leading to a **total time complexity of \( O(n) \)**.
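
As a worked instance for the example input, with \( m = 4 \) and \( k = 4 \) as in the code above: the hex string has \( n = 74 \) characters, which decode to 37 bytes, so the decryption loop performs at most

\[ m \times k \times \frac{n}{2} = 4 \times 4 \times 37 = 592 \]

byte-level XOR operations. This count grows linearly with \( n \) while \( m \) and \( k \) stay fixed.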

Regarding space complexity:

- We store the frequency distribution of the bytes, which, in the worst case, could have as many entries as there are distinct bytes in the input. However, since the range of a byte is fixed (0-255), this is a constant \( O(1) \) space complexity.
- The decoded byte string and each stored candidate message have length \( n/2 \), and at most \( m \times k \) candidates are kept, so this storage is \( O(n) \).

Thus, the **total space complexity is \( O(n) \)**, primarily due to the storage of the decrypted messages.
