# Project Euler
## Problem 59
### XOR decryption

<p>Each character on a computer is assigned a unique code and the preferred standard is ASCII (American Standard Code for Information Interchange). For example, uppercase A = 65, asterisk (*) = 42, and lowercase k = 107.</p>
<p>A modern encryption method is to take a text file, convert the bytes to ASCII, then XOR each byte with a given value, taken from a secret key. The advantage with the XOR function is that using the same encryption key on the cipher text, restores the plain text; for example, 65 XOR 42 = 107, then 107 XOR 42 = 65.</p>
<p>For unbreakable encryption, the key is the same length as the plain text message, and the key is made up of random bytes. The user would keep the encrypted message and the encryption key in different locations, and without both "halves", it is impossible to decrypt the message.</p>
<p>Unfortunately, this method is impractical for most users, so the modified method is to use a password as a key. If the password is shorter than the message, which is likely, the key is repeated cyclically throughout the message. The balance for this method is using a sufficiently long password key for security, but short enough to be memorable.</p>
<p>Your task has been made easy, as the encryption key consists of three lower case characters. Using <a href="https://projecteuler.net/project/resources/p059_cipher.txt">p059_cipher.txt</a> (right click and 'Save Link/Target As...'), a file containing the encrypted ASCII codes, and the knowledge that the plain text must contain common English words, decrypt the message and find the sum of the ASCII values in the original text.</p>
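
The XOR round trip and the cyclic key repetition described above can be sketched in a few lines. This is only an illustration: `xor_with_key`, `message`, and `demo_key` are made-up names and values, not part of the problem or of the real cipher file.

```python
from itertools import cycle


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key cyclically."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Single-byte example from the problem statement.
print(65 ^ 42)   # 107
print(107 ^ 42)  # 65

# Round trip with a made-up three-letter key (not the real one).
message = b"Hello, Euler!"
demo_key = b"abc"
cipher = xor_with_key(message, demo_key)
print(xor_with_key(cipher, demo_key) == message)  # True
```

Because XOR is its own inverse, the same function both encrypts and decrypts.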

### Solution

The first step is to load the comma-separated codes from the text file into a
Python data structure. I decided to convert them into a list of integers:

In [1]:
file_name = "p059_cipher.txt"
with open(file_name, "r") as file:
    text = file.read()

# The file holds comma-separated ASCII codes; convert each one to an int.
char_list = [int(code) for code in text.split(",")]

Looking at the text file, the numbers 88 and 80 show up very frequently, and
their appearances seem to be periodic. I suspected that these numbers were
encrypted spaces. To make sure they really were that frequent and I wasn't
tricking myself, I counted how often each number appears in the list:

In [2]:
from pprint import pprint

counts = {}
for number in char_list:
    counts.setdefault(number, 0)
    counts[number] += 1
sorted_counts = sorted(counts.items(), key=lambda item: item[1], reverse=True)
counts = dict(sorted_counts)
pprint(counts, sort_dicts=False)

{80: 107,
 69: 86,
 88: 77,
 0: 75,
 17: 73,
 29: 70,
 21: 65,
 12: 65,
 4: 61,
 22: 56,
 10: 52,
 23: 46,
 11: 43,
 25: 42,
 16: 38,
 3: 36,
 13: 33,
 2: 31,
 30: 26,
 28: 25,
 8: 25,
 31: 24,
 20: 22,
 19: 21,
 24: 21,
 9: 20,
 1: 19,
 84: 16,
 5: 15,
 6: 11,
 18: 11,
 27: 10,
 7: 9,
 65: 9,
 87: 9,
 83: 8,
 73: 7,
 26: 7,
 70: 7,
 92: 6,
 67: 5,
 75: 5,
 86: 4,
 94: 4,
 81: 4,
 68: 3,
 78: 3,
 72: 3,
 64: 3,
 77: 3,
 49: 3,
 36: 2,
 66: 2,
 82: 2,
 35: 2,
 95: 2,
 91: 2,
 15: 2,
 44: 2,
 14: 2,
 76: 2,
 74: 2,
 61: 1,
 71: 1,
 60: 1,
 63: 1,
 45: 1,
 57: 1,
 54: 1,
 62: 1,
 89: 1}


The numbers 80, 69, and 88 are the most frequent. If they all encode the same
plaintext character under the three different key letters, then each number's
positions in the list should be constant mod 3 (since the key is 3 characters
long). Looking at their positions mod 3, we can see that this is true, save
for a few outliers for 80:

In [3]:
for i, char in enumerate(char_list):
    if char == 80:
        print(i % 3, end=' ')

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 0 0 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 1 2 2 2 2 2 2 2 2 2 2 2 2 2 

In [4]:
for i, char in enumerate(char_list):
    if char == 69:
        print(i % 3, end=' ')

0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

In [5]:
for i, char in enumerate(char_list):
    if char == 88:
        print(i % 3, end=' ')

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
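
The three checks above can also be collapsed into a tally per residue. The sketch below is self-contained, so it uses a synthetic stand-in for `char_list`: a sample sentence encrypted with a hypothetical key. The helper name `residue_counts` and the sample data are assumptions for illustration only.

```python
from collections import Counter
from itertools import cycle


def residue_counts(codes, value, key_length=3):
    """Count how often `value` appears at each index modulo key_length."""
    return Counter(
        i % key_length for i, code in enumerate(codes) if code == value
    )


# Synthetic stand-in for char_list: a sample sentence encrypted with a
# hypothetical key, so the block runs without p059_cipher.txt.
sample_text = "the quick brown fox jumps over the lazy dog " * 5
sample_key = "key"
sample_codes = [ord(p) ^ ord(k) for p, k in zip(sample_text, cycle(sample_key))]

# A space encrypts to a different number at each key position, so each of
# those numbers should show up at a single residue.
for k in sample_key:
    value = ord(" ") ^ ord(k)
    print(value, dict(residue_counts(sample_codes, value)))
```

On the real data, calling `residue_counts(char_list, 80)` should reproduce the split seen above: mostly residue 2, with a handful of outliers.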

Since the most common character is likely to be the space character, we can
take the ASCII number of the space character and XOR it with 69, 88, and 80
(the numbers found at positions 0, 1, and 2 mod 3, respectively) to get the
key of this message.

In [6]:
ord(" ")

32

In [7]:
key = f"{chr(69 ^ 32)}{chr(88 ^ 32)}{chr(80 ^ 32)}"
print(key)

exp


A fitting key for Project Euler. :)
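
The reasoning behind this step is that if c = p XOR k, then k = c XOR p, so a correct guess for one plaintext character reveals the key letter. A minimal sketch that automates the guess is shown below, assuming the most frequent plaintext character at every key position is a space. `guess_key` and the sample data are hypothetical names introduced for this example.

```python
from collections import Counter
from itertools import cycle


def guess_key(codes, key_length=3, assumed_plain=" "):
    """Guess each key character from the most frequent number in its stream.

    Assumes the most common plaintext character is `assumed_plain`.
    """
    key_chars = []
    for offset in range(key_length):
        # Every key_length-th number was encrypted with the same key letter.
        stream = codes[offset::key_length]
        most_common_code, _ = Counter(stream).most_common(1)[0]
        key_chars.append(chr(most_common_code ^ ord(assumed_plain)))
    return "".join(key_chars)


# Synthetic check with a hypothetical key, so no cipher file is needed.
sample_text = "the quick brown fox jumps over the lazy dog " * 5
sample_key = "key"
sample_codes = [ord(p) ^ ord(k) for p, k in zip(sample_text, cycle(sample_key))]
print(guess_key(sample_codes))
```

This only works when spaces really dominate each stream; for short or unusual texts the guess can be wrong and should be checked by reading the decrypted output.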

Now, we decrypt the message, saving both the plaintext characters for the
message and the ASCII numbers for finding the sum that the problem wants.

In [8]:
plaintext_ascii = []
plaintext = []
for i, char in enumerate(char_list):
    # XOR each number with the key letter for its position (the key repeats).
    ascii_number = char ^ ord(key[i % len(key)])
    plaintext.append(chr(ascii_number))
    plaintext_ascii.append(ascii_number)

In [9]:
print("".join(plaintext))

An extract taken from the introduction of one of Euler's most celebrated papers, "De summis serierum reciprocarum" [On the sums of series of reciprocals]: I have recently found, quite unexpectedly, an elegant expression for the entire sum of this series 1 + 1/4 + 1/9 + 1/16 + etc., which depends on the quadrature of the circle, so that if the true sum of this series is obtained, from it at once the quadrature of the circle follows. Namely, I have found that the sum of this series is a sixth part of the square of the perimeter of the circle whose diameter is 1; or by putting the sum of this series equal to s, it has the ratio sqrt(6) multiplied by s to 1 of the perimeter to the diameter. I will soon show that the sum of this series to be approximately 1.644934066842264364; and from multiplying this number by six, and then taking the square root, the number 3.141592653589793238 is indeed produced, which expresses the perimeter of a circle whose diameter is 1. Following again the same ste

In [10]:
print(sum(plaintext_ascii))

129448
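
As a cross-check, the problem's hint about common English words suggests a different route from the frequency analysis used here: try all 26³ = 17,576 lowercase keys and keep the one whose output contains the most common words. The sketch below is an alternative, not the method used in this notebook; `COMMON_WORDS`, `decrypt`, `score`, `brute_force_key`, and the sample data are all assumptions made for illustration.

```python
from itertools import cycle, product
from string import ascii_lowercase

# Hypothetical word list; any short list of very common English words would do.
COMMON_WORDS = (" the ", " and ", " of ", " to ", " in ", " is ")


def decrypt(codes, key):
    """XOR each number with the cyclically repeated key and return the text."""
    return "".join(chr(c ^ ord(k)) for c, k in zip(codes, cycle(key)))


def score(text):
    """Count occurrences of the common words in a candidate plaintext."""
    return sum(text.count(word) for word in COMMON_WORDS)


def brute_force_key(codes, key_length=3):
    """Try every lowercase key and return the best-scoring one."""
    candidates = (
        "".join(letters)
        for letters in product(ascii_lowercase, repeat=key_length)
    )
    return max(candidates, key=lambda key: score(decrypt(codes, key)))


# Synthetic check with a hypothetical key, so no cipher file is needed.
sample_text = (
    "the quick brown fox jumps over the lazy dog "
    "and the cat is in the house of the king "
)
sample_key = "key"
sample_codes = [ord(p) ^ ord(k) for p, k in zip(sample_text, cycle(sample_key))]

best_key = brute_force_key(sample_codes)
print(best_key)
print(sum(ord(ch) for ch in decrypt(sample_codes, best_key)))
```

Run against `char_list`, this approach would be expected to agree with the key found above, at the cost of many more decryptions.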
