# PDA data science - Python practice
<div class="alert alert-block alert-info">
    Notebook 1: by michael.ferrie@edinburghcollege.ac.uk <br> Edinburgh College, January 2024
</div>

In cryptography, a Caesar cipher, also known as Caesar's cipher, the shift cipher, Caesar's code or Caesar shift, is one of the simplest and most widely known encryption techniques. It is a type of substitution cipher in which each letter in the plaintext is replaced by a letter some fixed number of positions down the alphabet. For example, with a left shift of 3, D would be replaced by A, E would become B, and so on. The method is named after Julius Caesar, who used it in his private correspondence.

Encryption can also be represented using modular arithmetic by first transforming the letters into numbers according to the scheme A → 0, B → 1, ..., Z → 25. Encryption of a letter x by a shift n can be described mathematically as:

$$ {\displaystyle E_{n}(x)=(x+n)\mod {26}.}  $$


Decryption is performed similarly:

$$ {\displaystyle D_{n}(x)=(x-n)\mod {26}.}  $$
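
As a minimal sketch of the two formulas, the functions below shift a single lowercase letter forwards to encrypt and backwards to decrypt. The names `ALPHABET`, `encrypt_letter` and `decrypt_letter` are illustrative and not part of the exercise code. Note that Python's `%` operator returns a non-negative result when the right-hand side is positive, so subtracting the shift still lands in the range 0 to 25.

```python
import string

# 'a' -> 0, 'b' -> 1, ..., 'z' -> 25 (illustrative helper, not from the exercise)
ALPHABET = string.ascii_lowercase


def encrypt_letter(letter, n):
    """Apply E_n(x) = (x + n) mod 26 to one lowercase letter."""
    x = ALPHABET.index(letter)
    return ALPHABET[(x + n) % 26]


def decrypt_letter(letter, n):
    """Apply D_n(x) = (x - n) mod 26 to one lowercase letter."""
    x = ALPHABET.index(letter)
    return ALPHABET[(x - n) % 26]


shifted = encrypt_letter('y', 3)        # (24 + 3) % 26 = 1, which is 'b'
restored = decrypt_letter(shifted, 3)   # (1 - 3) % 26 = 24, which is 'y'
print(shifted, restored)
```

Decrypting with the same shift that was used to encrypt returns the original letter, including when the shift wraps past the end of the alphabet.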

There are different definitions of the modulo operation. In Python, list indexing starts at 0, so we work with the range 0 to 25. Look at the following example then answer the questions below.

First, a list of letters called `letters` is created. Then two variables, `key` and `value`, are hardcoded to represent the input to the program. A for loop then gets the index of each character of the value in the list, increments it by the key modulo 26, and prints the letter at the new index. This program only works for a right shift in the alphabet.

In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Define key and value
key = 3
value = 'hello there'
#test_value = 'jfzexbi'

# Loop over value and find new index in list based on key
for st in value:
    index = letters.index(st)
    index += key
    index %= 26
    print(letters[index])

### Questions

1) Adapt the program in the next cell so that it can handle spaces in the value variable.

In [1]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Define key and value
key = 2
value = 'h g'

# Loop over letters, find new value in list by key
for st in value:
    if st == ' ':
        print(st)
    else:
        index = letters.index(st)
        index += key
        index %= 26
        print(letters[index])

j
 
i


2) Adapt the program in the next cell so that the program asks the user to enter the value.

In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Define key and value
key = 2
prompt = "Enter the text to be encrypted:\n"
value = input(prompt)

# Loop over letters, find new value in list by key
for st in value:
    index = letters.index(st)
    index += key
    index %= 26
    print(letters[index])

r
k
r
r
q


3) Adapt the program in the next cell so that the program asks the user to enter the key.

In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Define key and value
prompt = "Enter the encryption key, an integer between 1 and 26:\n"
key = int(input(prompt))
value = 'h'

# Loop over letters, find new value in list by key
for st in value:
    index = letters.index(st)
    index += key
    index %= 26
    print(letters[index])

t


4) Adapt the program in the next cell so that it can accept uppercase letters and convert them to lowercase.

In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Define key and value
key = 2
value = 'H'

# Loop over letters, find new value in list by key
for st in value.lower():
    index = letters.index(st)
    index += key
    index %= 26
    print(letters[index])

5) Combine all of your improvements to the program into a new final program and add some exception handling.

* The program should stop with an error message if the user enters an integer as a value or a non-integer as a key.
* The program should say `Error: Values must be alphabetical and keys must be numeric` if incorrect values are entered.
* Add useful comments to the program describing what it does.

In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Answer for Q5 here:

# Create numbers list
nums = list("0123456789")


# Prompt for a key until a valid key is entered
prompt = "Enter the encryption key, an integer between 1 and 26:\n"
while True:

    try:
        # Try converting the input string into an integer
        key = int(input(prompt))
    except Exception:
        # If conversion is not possible, the user has entered a non-integer as a key
        # The prompt is replaced with an error message and the loop asks again
        prompt = "Error: Values must be alphabetical and keys must be numeric. Try again: \n"
    else:
        # If conversion is possible, check if the key is in the correct range
        if (key > 0 and key < 27):
            break
        else:
            prompt = "Error: Keys must be between 1 and 26. Try again: \n"

# Prompt for a value until a valid value is entered
prompt = "Enter the text to be encrypted:\n"
while True:

    value = input(prompt)
    validValue = True
    # Check if there is any number in the input value
    for char in list(value):
        if char in nums:
            prompt = "Error: Values must be alphabetical and keys must be numeric. Try again: \n"
            validValue = False
            break
    if validValue:
        break

# Loop over letters, find new value in list by key
for s in value.lower():
    if s == ' ':
        print(s)
    else:
        index = letters.index(s)
        index += key
        index %= 26
        print(letters[index])


r
q
r
r
r


In [None]:
# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Answer for Q5 here:

# Create numbers list
nums = list("0123456789")

# Define key and value
prompt = "Enter the encryption key, an integer between 1 and 26:\n"
try:
    # Try converting the input string into an integer
    key = int(input(prompt))
except Exception:
    # If conversion is not possible, the user has entered a non-integer as a key
    # An error message is printed and the program stops
    print("Error: Values must be alphabetical and keys must be numeric")
else:
    # If conversion is possible, the program continues normally
    prompt = "Enter the text to be encrypted:\n"
    value = input(prompt)

    validValue = True
    # Check if there is any number in the input value
    for char in list(value):
        if char in nums:
            validValue = False
            break
    # There are no numbers in the input: proceed normally
    if validValue:
        # Loop over letters, find new value in list by key
        for s in value.lower():
            if s == ' ':
                print(s)
            else:
                index = letters.index(s)
                index += key
                index %= 26
                print(letters[index])
    # A number was found in the value: print error message
    else:
        print("Error: Values must be alphabetical and keys must be numeric")


6. The biggest limitation of the Caesar cipher is that if the key is intercepted, the message can easily be decoded. There is a far stronger form of encryption known as the [One Time Pad (OTP)](https://www.cryptomuseum.com/crypto/otp/index.htm), in which a unique key is generated for every message and is used only once.

* Write a program that is an adaptation of the program from question 5.
* It should still ask the user to input a value to encrypt, but this time your program should also generate a list of random numbers from 1-26 called `otp_list` that is the same length as the value string. This will be our key stream.
* Each character in the value string should be matched to its position in the `letters` list and then shifted by the corresponding number in `otp_list`.
* Then use this to output the value and the value encrypted by the OTP cipher.

For example, if the value string was input as `cat` and an OTP list of `234` was generated, the program should shift the c by 2, the a by 3 and the t by 4, resulting in `edx`.
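
The worked example can be checked with a short sketch. It uses a fixed list of shifts in place of randomly generated ones so that the result is reproducible. The function name `otp_encrypt` is hypothetical and is only used here for illustration; it assumes the value contains lowercase letters only.

```python
import string

# Same 26 lowercase letters as the `letters` list used in the exercise cells
letters = list(string.ascii_lowercase)


def otp_encrypt(value, otp_list):
    """Shift each letter of value by the number at the same position in otp_list."""
    encrypted = []
    for char, shift in zip(value, otp_list):
        index = (letters.index(char) + shift) % 26
        encrypted.append(letters[index])
    return ''.join(encrypted)


# Fixed key stream taken from the worked example (normally generated at random)
result = otp_encrypt('cat', [2, 3, 4])
print(result)  # edx
```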

In [None]:
# Answer for Q6 here

import random

# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Answer for Q5 here:

# Create numbers list
nums = list("0123456789")

# Define value
prompt = "Enter the text to be encrypted:\n"
value = input(prompt)

validValue = True
# Check if there is any number in the input value
for char in list(value):
    if char in nums:
        validValue = False
        break
# There are no numbers in the input: proceed normally
if validValue:

    # Generate random list of keys
    vl = len(value)
    key = []
    for i in range(vl):
        key.append(random.randint(1, 26))
    print(key)

    # Loop over letters, find new value in list by key
    for i, st in enumerate(value.lower()):
        # If the letter is a space, print it unchanged
        if st == ' ':
            print(st)
        # Encode the letter with the key corresponding to that letter
        else:
            index = letters.index(st)
            index += key[i]
            index %= 26
            print(letters[index])
# A number was found in the value: print error message
else:
    print("Error: Values must be alphabetical and keys must be numeric")


<div class="alert alert-block alert-warning">
<b>Challenge questions:</b> If you found the other questions easy, attempt to complete these.
</div>

7. We have now created two programs for generating ciphers. Next, let's look at a way to send the information over a network. It is common to perform a bitwise XOR operation on a value and its key to generate an encrypted message.

* Adapt the following code to perform the exclusive or (XOR) operation on the key and the value to generate a cipher.

* Iterate over the key and the value and generate the XOR value.

In [4]:
# Converting values to 8-bit binary strings
value = "hello"
print(" ".join(f"{ord(i):08b}" for i in value))

key = "12345"
print(" ".join(f"{ord(i):08b}" for i in key))

# Generating XOR's
def xor(x, y):
    return bool((x and not y) or (not x and y))
print(xor(0,0))
print(xor(0,1))
print(xor(1,0))
print(xor(1,1))

01101000 01100101 01101100 01101100 01101111
00110001 00110010 00110011 00110100 00110101
False
True
True
False
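
As an alternative sketch, Python's built-in `^` operator performs XOR on all the bits of two integers at once, so the same `value` and `key` can be combined character by character without a hand-written `xor` function. The variable name `cipher_codes` is illustrative only.

```python
value = "hello"
key = "12345"

# XOR the code points of each pair of characters; ^ operates on every bit at once
cipher_codes = [ord(v) ^ ord(k) for v, k in zip(value, key)]

# Print the result in the same 8-bit format used for the value and key
cipher_bits = " ".join(f"{code:08b}" for code in cipher_codes)
print(cipher_bits)
```

Working bit by bit with the `xor` function and working on whole code points with `^` should give the same bit pattern. The bit-by-bit version makes the truth table visible, while `^` is shorter.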


In [None]:
# Your code here for Q7

import random

# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Answer for Q5 here:

# Create numbers list
nums = list("0123456789")

# Define value
prompt = "Enter the text to be encrypted:\n"
value = input(prompt)

validValue = True
# Check if there is any number in the input value
for char in list(value):
    if char in nums:
        validValue = False
        break
# There are no numbers in the input: proceed normally
if validValue:

    # Generate random key
    vl = len(value)
    key = []
    for i in range(vl):
        key.append(str(random.randint(0, 9)))
    # print(key)

    binary_value = "".join(f"{ord(i):08b}" for i in value)
    binary_key = "".join(f"{ord(i):08b}" for i in key)

    # print(binary_key, binary_value)

    xor_value = ''

    for i, el in enumerate(binary_value):
        xored_el = str(int(xor(int(el), int(binary_key[i]))))
        xor_value += xored_el

    print(xor_value)

# A number was found in the value: print error message
else:
    print("Error: Values must be alphabetical and keys must be numeric")

0100000001011110010010010100100101011101


8. As shown here, XOR works for generating a cipher because it is its own inverse, in that:

$$ a = (a \oplus b) \oplus b $$

* Write a program to demonstrate that, using your key and value from question 7, you can XOR the ciphertext back against the key to generate the value.
* Once you have the value, use the decoder to turn it back into plain text.

In [2]:
# Decoder function
def decode_binary_string(s):
    return ''.join(chr(int(s[i*8:i*8+8],2)) for i in range(len(s)//8))
x='0110100001100101011011000110110001101111'
decode_binary_string(x)

'hello'

In [9]:
# Your code here for Q8

import random

# Create letters list
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
           'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']

# Answer for Q5 here:

# Create numbers list
nums = list("0123456789")

# Define value
prompt = "Enter the text to be encrypted:\n"
value = input(prompt)

validValue = True
# Check if there is any number in the input value
for char in list(value):
    if char in nums:
        validValue = False
        break
# There are no numbers in the input: proceed normally
if validValue:

    # Generate random key
    vl = len(value)
    key = []
    for i in range(vl):
        key.append(str(random.randint(0, 9)))


    # Convert key and value to binary
    binary_value = "".join(f"{ord(i):08b}" for i in value)
    binary_key = "".join(f"{ord(i):08b}" for i in key)

    # XOR key and value
    ciphertext = ''
    for i, el in enumerate(binary_value):
        xored_el = str(int(xor(int(el), int(binary_key[i]))))
        ciphertext += xored_el

    print(ciphertext)

    # XOR the ciphertext with the binary key to get back the binary value

    deciphered_value = ''
    for i, el in enumerate(ciphertext):
        xored_el = str(int(xor(int(el), int(binary_key[i]))))
        deciphered_value += xored_el

    print("Deciphered value: ", deciphered_value)
    print("Deciphered value = binary value?: ",  deciphered_value == binary_value)

    decoded_value = decode_binary_string(deciphered_value)

    print("Decoded value: ", decoded_value)
    print("Decoded value = original value?: ", decoded_value == value)

# A number was found in the value: print error message
else:
    print("Error: Values must be alphabetical and keys must be numeric")


0110010001011010010001010100010101011001
Deciphered value:  0101000001101001011100000111000001101111
Deciphered value = binary value?:  True
Decoded value:  Pippo
Decoded value = original value?:  True
