# Exploring the Security of One-Time and Many-Time Pads

## Overview
This team project investigates the cryptographic security of the one-time pad, which is perfectly secure under certain conditions, and explores the consequences of reusing a key, a scheme referred to as a many-time pad. We will implement both encryption methods, analyze their vulnerabilities, and develop strategies to exploit those vulnerabilities in a controlled environment. For historical background from signals intelligence, see the Venona project: https://en.wikipedia.org/wiki/Venona_project

## Objectives
1. Understand One-Time Pad
Examine the concept of the one-time pad, its implementation, and why it is considered perfectly secure.
2. Implement One-Time Pad Encryption and Decryption
Develop a Python program that simulates the encryption and decryption process between two parties.
3. Explore Many-Time Pad
Extend the one-time pad implementation to simulate a many-time pad scenario, where a single key is used to encrypt multiple messages.
4. Cryptanalysis of Many-Time Pad
Design and execute an attack strategy to decrypt messages encrypted with a many-time pad, focusing on exploiting the vulnerabilities introduced by key reuse.

## Problem 1: Understanding One-Time Pad
Tasks:
1. Research the theoretical basis of the one-time pad, including its requirements and operational principles.
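
One way to make the perfect-secrecy claim concrete: for a given ciphertext, every plaintext of the same length is consistent with it under some key, so the ciphertext alone says nothing about which message was sent. The following is a minimal sketch of that idea, not part of the assignment code; the helper `key_for` and both messages are made up for illustration.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError('inputs must have the same length')
    return bytes(x ^ y for x, y in zip(a, b))

def key_for(ciphertext: bytes, candidate: bytes) -> bytes:
    """Hypothetical helper: the key under which `ciphertext` decrypts to `candidate`."""
    return xor_bytes(ciphertext, candidate)

real_plaintext = b'attack at dawn'
real_key = os.urandom(len(real_plaintext))
ciphertext = xor_bytes(real_plaintext, real_key)

# Any other 14-byte message is an equally valid decryption under a different key
decoy = b'retreat at ten'
decoy_key = key_for(ciphertext, decoy)

print(xor_bytes(ciphertext, real_key))
print(xor_bytes(ciphertext, decoy_key))
```

Because the real key is uniformly random, the decoy key is just as likely from the attacker's point of view, which is why the guarantee depends on the key being random, as long as the message, and never reused.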

## Problem 2: One-Time Pad Implementation
Tasks: Simulate the encryption and decryption process between two parties, Alice and Bob.
1. Alice's Program
Should prompt for a message (plaintext), display the ciphertext, and save the ciphertext and the key (both in hex) in separate files.
2. Bob's Program
Should read the key and ciphertext from their respective files and display the decrypted plaintext.

In [10]:
from pathlib import Path
import os
import string

# Print formatted output for better readability
def print_formatted(description:str, data:str):
    print(f'{description:<50}: {data}')

# Create a mailbox directory to store messages
def create_data_directory(directory_name:str) -> Path:
    file_directory = Path(f'./{directory_name}/')
    file_directory.mkdir(exist_ok=True)
    print_formatted('Using data directory', file_directory.name)
    return file_directory

# For the one-time pad, the key must be as long as the plaintext and is different for each message.

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

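def one_time_pad_encrypt(plaintext:str) -> tuple:
    # Size the key from the encoded bytes: UTF-8 can use more than one byte per character,
    # and zip() in xor_bytes would silently truncate the message if the key were shorter
    plaintext_bytes = string_to_bytes(plaintext)
    key = generate_key(len(plaintext_bytes))
    return xor_bytes(plaintext_bytes, key), key

def one_time_pad_decrypt(ciphertext:bytes, key:bytes) -> bytes:
    return xor_bytes(ciphertext, key)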

# Helpers for writing and reading data to/from files
# The project requires the data to be stored as hex
# Convert bytes to hex and vice versa
def bytes_to_hex(byte_data: bytes) -> str:
    return byte_data.hex()

def hex_to_bytes(hex_string:str) -> bytes:
    return bytes.fromhex(hex_string)

def write_data(directory_path:Path, file_name:str, message:bytes) -> None:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open('w') as f:
        f.write(bytes_to_hex(message))

def read_data(directory_path:Path, file_name:str) -> bytes:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open() as f:
        return hex_to_bytes(f.read())

# Alice and Bob programs
def alice_program(message_name:str, directory_path:Path) -> None:
    # get user input
    plaintext = input('Enter a message: ')
    print_formatted('Plaintext', plaintext)
    # encrypt the message using one-time pad
    ciphertext, key = one_time_pad_encrypt(plaintext)
    print_formatted('Alice sending message (Ciphertext)', ciphertext)
    write_data(directory_path, message_name, ciphertext)
    write_data(directory_path, message_name + '_key', key)

def bob_program(message_name:str, directory_path:Path) -> None:
    ciphertext = read_data(directory_path, message_name)
    key = read_data(directory_path, message_name + '_key')
    # decrypt the message using one-time pad
    message = one_time_pad_decrypt(ciphertext, key)
    print_formatted('Bob received message (Plaintext)', bytes_to_string(message))

# Test the one-time pad implementation for alice and bob
otp_data_directory = create_data_directory('otp_data')
message_file_name = 'message1'
alice_program(message_file_name, otp_data_directory)
bob_program(message_file_name, otp_data_directory)

Using data directory                              : otp_data
Plaintext                                         : Hello bob
Alice sending message (Ciphertext)                : b'v\x1a\xb3\xf9|\x01:\x8b\x1f'
Bob received message (Plaintext)                  : Hello bob
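
The XOR operates on encoded bytes, so the key has to cover the UTF-8 byte length of the message, which can exceed the character count for non-ASCII input. The sketch below illustrates the difference under that assumption; the `encrypt` and `decrypt` helpers and the sample strings are illustrative and independent of the cell above.

```python
import os

def encrypt(plaintext: str) -> tuple:
    """Encrypt with a fresh random key sized to the encoded message."""
    data = plaintext.encode('utf-8')
    key = os.urandom(len(data))
    ciphertext = bytes(a ^ b for a, b in zip(data, key))
    return ciphertext, key

def decrypt(ciphertext: bytes, key: bytes) -> str:
    return bytes(a ^ b for a, b in zip(ciphertext, key)).decode('utf-8')

# The second message contains one accented character (U+00E9), which UTF-8 encodes as two bytes
for message in ('Hello bob', 'h\u00e9llo bob'):
    ciphertext, key = encrypt(message)
    print(f'chars={len(message)} bytes={len(message.encode("utf-8"))} key={len(key)}')
    assert decrypt(ciphertext, key) == message
```

If the key were sized from the character count instead, the last byte of the second message would be dropped by `zip` and decryption could fail or lose data.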


In [20]:

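# Reuse the file name variable; nesting the same quote type inside an f-string is a syntax error before Python 3.12
data_file = otp_data_directory / f'{message_file_name}.txt'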
with data_file.open() as f:
    read_text = f.read()

In [21]:
ciphertext

b'v\x1a\xb3\xf9|\x01:\x8b\x1f'

In [22]:
read_text

'761ab3f97c013a8b1f'
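
The two values shown above are the same nine bytes in two representations: the `bytes` repr prints printable ASCII as characters and everything else as `\x` escapes, while the file holds two hex digits per byte. A short check of that round trip, using only the values from the cells above:

```python
# Values copied from the notebook output above
ciphertext_bytes = b'v\x1a\xb3\xf9|\x01:\x8b\x1f'
stored_hex = '761ab3f97c013a8b1f'

# bytes.hex() and bytes.fromhex() are inverses of each other
assert ciphertext_bytes.hex() == stored_hex
assert bytes.fromhex(stored_hex) == ciphertext_bytes

# Two hex digits encode one byte
print(len(ciphertext_bytes), len(stored_hex))
```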

## Problem 3: Exploring Many-Time Pad
Tasks: Modify the one-time pad implementation to encrypt multiple messages with the same key, simulating a many-time pad scenario. The purpose of this problem is to look for recognizable patterns in the outputs. You can gain insight by changing the plaintexts or the key and checking whether the patterns persist. These findings will be useful in the next problem.
1. The program should encrypt a list of 10 predefined plaintext messages with a single key, saving the plaintexts, key, and ciphertexts (all in hex) into a file. You can select 10 of your favorite messages. Assume the key is at least as long as the longest of the 10 messages.
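
One pattern worth checking for: when two messages share a key, XORing their ciphertexts cancels the key and leaves the XOR of the two plaintexts. The sketch below demonstrates this with two made-up messages and a fresh random key; the variable names are illustrative.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    # zip() stops at the shorter input, so the result has the shorter length
    return bytes(x ^ y for x, y in zip(a, b))

p1 = b'the cat sat'
p2 = b'the dog ran'
key = os.urandom(32)  # longer than both messages, reused for both

c1 = xor_bytes(p1, key)
c2 = xor_bytes(p2, key)

# (p1 ^ key) ^ (p2 ^ key) == p1 ^ p2, whatever the key is
combined = xor_bytes(c1, c2)
assert combined == xor_bytes(p1, p2)

# Zero bytes mark positions where the two plaintexts agree
print(combined.hex())
```

The result does not depend on the key at all, so an eavesdropper holding only the two ciphertexts learns where the messages match and how they differ.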

In [15]:
from pathlib import Path
import os

# Print formatted output for better readability
def print_formatted(description:str, data:str):
    print(f'{description:<50}: {data}')

# Create a mailbox directory to store messages
def create_data_directory(directory_name:str) -> Path:
    file_directory = Path(f'./{directory_name}/')
    file_directory.mkdir(exist_ok=True)
    print_formatted('Using data directory', file_directory.name)
    return file_directory

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# For storing and retrieving from file in hex
def string_to_hex(string:str) -> str:
    return string.encode('utf-8').hex()

def hex_to_string(hex_string:str) -> str:
    return bytes.fromhex(hex_string).decode('utf-8')

# Many-time pad encryption for multiple plaintexts using the same key
def many_time_pad_encrypt(plaintexts: list, key: bytes) -> list:
    ciphertexts = []
    for plaintext in plaintexts:
        ciphertexts.append(xor_bytes(string_to_bytes(plaintext), key))
    return ciphertexts

# Many-time pad decryption for multiple ciphertexts using the same key
def many_time_pad_decrypt(ciphertexts: list, keys: list) -> list:
    plaintexts = []
    for key, ciphertext in zip(keys, ciphertexts):
        plaintexts.append(bytes_to_string(xor_bytes(ciphertext, key)))
    return plaintexts

# Write and read many-time pad data to/from files
# The format of the files is key,plaintext,ciphertext
# The key, plaintext, and ciphertext are in hex format

# The project requires the data to be stored as hex
# Convert bytes to hex and vice versa
def bytes_to_hex(byte_data: bytes) -> str:
    return byte_data.hex()

def hex_to_bytes(hex_string:str) -> bytes:
    return bytes.fromhex(hex_string)

def write_many_time_pad_data(directory_path: Path, file_name:str, key: bytes, plaintexts:list, ciphertexts: list) -> None:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open('w') as f:
        for plaintext, ciphertext in zip(plaintexts, ciphertexts):
            f.write(f'{bytes_to_hex(key)},{string_to_hex(plaintext)},{bytes_to_hex(ciphertext)}\n')

def read_many_time_pad_data(directory_path: Path, file_name:str) -> tuple:
    keys = []
    plaintexts = []
    ciphertexts = []
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open() as f:
        for line in f:
            k, p, c = line.split(',')
            keys.append(hex_to_bytes(k.strip()))
            plaintexts.append(hex_to_string(p.strip()))
            ciphertexts.append(hex_to_bytes(c.strip()))
    return keys, plaintexts, ciphertexts

# Many-time pad encryption and decryption for multiple messages
# 10 even longer messages to test
mtp_messages = [
    'hello world, how are you doing today?',
    'python programming is fun and challenging',
    'data science is the future of technology',
    'machine learning is a subset of artificial intelligence',
    'deep neural network is a type of machine learning',
    'cryptography is the practice and study of secure communication',
    'information security is the protection of information',
    'computer science is the study of computation',
    'software engineering is the application of engineering to software',
    'data analytics is the science of analyzing raw data'
    ]

mtp_key = generate_key(100)
mtp_ciphertexts = many_time_pad_encrypt(mtp_messages, mtp_key)
many_time_pad_data_directory = create_data_directory('mtp_data')
many_time_pad_file_name = 'many_time_pad_data'
write_many_time_pad_data(many_time_pad_data_directory, many_time_pad_file_name, mtp_key, mtp_messages, mtp_ciphertexts)
mtp_keys, mtp_plaintexts, mtp_ciphertexts = read_many_time_pad_data(many_time_pad_data_directory, many_time_pad_file_name)
print("Many-Time Pad Data:")
for i in range(len(mtp_messages)):
    print_formatted('Message', str(i+1))
    print_formatted('Many-Time Pad Key', mtp_keys[i])
    print_formatted('Many-Time Pad Plaintext', mtp_plaintexts[i])
    print_formatted('Many-Time Pad Ciphertext', mtp_ciphertexts[i])

print("Many-Time Pad Data decrypted:")
mtp_plaintexts_2 = many_time_pad_decrypt(mtp_ciphertexts, mtp_keys)
assert mtp_plaintexts == mtp_plaintexts_2
for plaintext in mtp_plaintexts_2:
    print_formatted('Many-Time Pad Decrypted Plaintext', plaintext)

Using data directory                              : mtp_data
Many-Time Pad Data:
Message                                           : 1
Many-Time Pad Key                                 : b"\xa9mdZ\xd4\xed\xfe\x17\xec\xafb\x16K\xa8'zKD=\x16\xb27\xa1\x94eLf\xc7&\xb1Dm\x81\x0c\x954Y\x05N\xc7\xd2\xc2g\xb2N\xc2\xc6J\xed\x90\xd1\xb3%\x83\x8b\x8b\xefp\x8ez\xc6Vdg%?\xff\xc7/\x01\x8b\xa9Zy\xf0\xa3\xea \xf4\xaf\x9c\n\xc7\x8f0Gg_U\x16\xe1\x08H\xb8\xa1M?\xe3Y\xcb"
Many-Time Pad Plaintext                           : hello world, how are you doing today?
Many-Time Pad Ciphertext                          : b'\xc1\x08\x086\xbb\xcd\x89x\x9e\xc3\x06:k\xc0H\rk%Os\x92N\xce\xe1E(\t\xaeH\xd6d\x19\xeeh\xf4Mf'
Message                                           : 2
Many-Time Pad Key                                 : b"\xa9mdZ\xd4\xed\xfe\x17\xec\xafb\x16K\xa8'zKD=\x16\xb27\xa1\x94eLf\xc7&\xb1Dm\x81\x0c\x954Y\x05N\xc7\xd2\xc2g\xb2N\xc2\xc6J\xed\x90\xd1\xb3%\x83\x8b\x8b\xefp\x8ez\xc6Vdg%?\xff\xc7/\x01\x8b\xa9Zy\x

In [33]:
from collections import Counter

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# For storing and retrieving from file in hex
def string_to_hex(string:str) -> str:
    return string.encode('utf-8').hex()

def hex_to_string(hex_string:str) -> str:
    return bytes.fromhex(hex_string).decode('utf-8')

collected_cypertexts = [
  "71fe1ace4389087266117cd7c98c4182851b3acff3b086e3f83f94d6eb05c4ba85d8e1fa14f11d1c3b568ff6cff5c09c5d67ef5c9c71b7eeb3d45a5154ab17b83e071ce9d8988adb4afedf46a840",
  "71fe1ace559a1e7266117cd7ce8745d7be2e74c3f0f68eeef57e8884e607debf81dfa0f012f95819681ae7f29fe4839b5175ef5e8760bef0b9d44b504eba12b22f5404f89dd085d550a48865a14f9b15a94dabe609ca2df2cccf210cefdb1af5389719795e1f0179cb77c5c456954d88f3",
  "72fe069c51c81a20775928c7879d4fd2a93c3acff3f69fe5fe2e9493a303d9ea98c4e5b60ae40a146058e7c787fbd09a1474e25dc865b5e6af865d4a40a61bfd384e06e0cfc1ccd356ff8853ac438905fa5fe3fd41cb3bbc8ac9",
  "67e543885b9a5b2267177084cf8453ccb8633ad7fdb39de5b13f8a93a304d6bf8bc4f4ef5def110b6f56a3e186e2c68c1470ef5c9c2ffbd6a291571e40ba1afd3b4b1fe0c4cbccc15df5dc07b043da01fa6ae4fd158f37b3c0cd",
  "71fe029a148c1236320d7192878a59cfbc3a6ec5e7f68befb13196d6ea1ec4ea81d9e3fe50ea0f196d02a2f7cfe2c29c5577e35d8630baf6ea80465b01aa1abc394f57a1f4ccccda59ff8846e44b8805bb5cabe608c231f2dec8364ae7d90ab4358c5c3a421b06",
  "6ef914ce5989152b321a769ad79c42c7be6f6ad2fab19de1fc339d84f04ad3a589dfa0ff09ab0c196f13e7e780b4c097556ded57c871fbeea393464a01aa0ab1381848cfd2d6898918efc046b00b8940bb08e3f313cb23b3dfd8645cfcd80ff82489",
  "71fe1ace4389087266117cd7c4865bd2b93b7fd2b5a58ce9f4308c9ff01e97ab82cbf2ef5dfc101d6a56b3fb8ab4d08b4167ef5c9c30b8f0ab97455b45e81efd364605e49ddb83df48eedc42b60c900fb14db4b229ca74b6c4d96442e1c34df8288f5c3a450a527ecc7c82865b8e",
  "71fe029a148c1437615978d7c58854dbec2c75cde5a39be5e37e9b97ef0697a285dfa0f01cff101d764983f29bf5",
  "71fe1ace50875b31730d6ad7cb8640c7ec3c73d4e1bf81e7b13796d6e518d8a4988ceff05dff101d2415a8fe9fe1d79a4623eb5e8430bfe3b3d442514faf40fd18420be0c8cb89924cf3cd5ee448950efd5cabe500c120f2d9d26440ebc34de029811977430b01748276d79012955cc6a65aebb9054becda5c9278"
]

def print_hex_only(description: str, data: bytes):
    hex_data = data.hex()
    hex_data = ''.join(['\\x' + (hex_data[i:i+2]) for i in range(0, len(hex_data), 2)])
    print(f'{description:<50}: {hex_data}')

# Example usage
ciphertext = b'\x48\x65\x6c\x6c\x6f'
print_formatted('Ciphertext', ciphertext)

letter_freq = Counter()
max_len = 0
byte_texts = []
for i, c_text in enumerate(collected_cypertexts, start=1):
    c_bytes = hex_to_bytes(c_text)
    print(f'hex       {i}: {c_text}')
    print(f'bytes     {i}: {c_bytes}')
    # Measure the decoded bytes, not the hex string (two hex digits per byte)
    max_len = max(max_len, len(c_bytes))
    byte_texts.append(c_bytes)

print(f'Max length of key: {max_len}')
print_hex_only("Bytes in hex", byte_texts[8])

print(letter_freq)

print(byte_texts[7])
print(byte_texts[8])
byte_text_1 = byte_texts[7]
print(byte_text_1)
print(len(byte_text_1))
byte_text_2 = byte_texts[8][0:len(byte_text_1)]
print(byte_text_1)
print(byte_text_2)

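# XORing two ciphertexts that share a key cancels the key, so the result is
# plaintext_1 XOR plaintext_2, not the pad itself
plaintext_xor = xor_bytes(byte_text_1, byte_text_2)
print(f'XOR of the two plaintexts: {plaintext_xor}')
# XORing it back into the first ciphertext returns the (truncated) second ciphertext
xor_bytes(byte_text_1, plaintext_xor)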

Ciphertext                                        : b'Hello'
hex       1: 71fe1ace4389087266117cd7c98c4182851b3acff3b086e3f83f94d6eb05c4ba85d8e1fa14f11d1c3b568ff6cff5c09c5d67ef5c9c71b7eeb3d45a5154ab17b83e071ce9d8988adb4afedf46a840
bytes     1: b'q\xfe\x1a\xceC\x89\x08rf\x11|\xd7\xc9\x8cA\x82\x85\x1b:\xcf\xf3\xb0\x86\xe3\xf8?\x94\xd6\xeb\x05\xc4\xba\x85\xd8\xe1\xfa\x14\xf1\x1d\x1c;V\x8f\xf6\xcf\xf5\xc0\x9c]g\xef\\\x9cq\xb7\xee\xb3\xd4ZQT\xab\x17\xb8>\x07\x1c\xe9\xd8\x98\x8a\xdbJ\xfe\xdfF\xa8@'
hex       2: 71fe1ace559a1e7266117cd7ce8745d7be2e74c3f0f68eeef57e8884e607debf81dfa0f012f95819681ae7f29fe4839b5175ef5e8760bef0b9d44b504eba12b22f5404f89dd085d550a48865a14f9b15a94dabe609ca2df2cccf210cefdb1af5389719795e1f0179cb77c5c456954d88f3
bytes     2: b'q\xfe\x1a\xceU\x9a\x1erf\x11|\xd7\xce\x87E\xd7\xbe.t\xc3\xf0\xf6\x8e\xee\xf5~\x88\x84\xe6\x07\xde\xbf\x81\xdf\xa0\xf0\x12\xf9X\x19h\x1a\xe7\xf2\x9f\xe4\x83\x9bQu\xef^\x87`\xbe\xf0\xb9\xd4KPN\xba\x12\xb2/T\x04\xf8\x9d\xd0\x85\xd5P\xa4\x88e\xa1O\x9b

b'q\xfe\x1a\xceP\x87[1s\rj\xd7\xcb\x86@\xc7\xec<s\xd4\xe1\xbf\x81\xe7\xb17\x96\xd6\xe5\x18\xd8\xa4\x98\x8c\xef\xf0]\xff\x10\x1d$\x15\xa8\xfe\x9f\xe1'
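
A common next step for the cryptanalysis is the space heuristic: XORing a space (0x20) with an ASCII letter gives the same letter with its case flipped, while XORing two letters never gives a letter. If one ciphertext byte XORed with most other ciphertext bytes in the same column yields letters (or zero, for another space), that ciphertext probably holds a space there, and the key byte is that ciphertext byte XOR 0x20. The sketch below is one possible implementation under stated assumptions, not the project's reference attack: `guess_key`, its `threshold` parameter, and the five sample messages are all hypothetical, and the plaintexts are assumed to be lowercase ASCII words separated by spaces.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def is_ascii_letter(value: int) -> bool:
    return 65 <= value <= 90 or 97 <= value <= 122

def guess_key(ciphertexts: list, threshold: float = 0.8) -> list:
    """Guess key bytes with the space heuristic; None where undecided."""
    key_guess = [None] * max(len(c) for c in ciphertexts)
    for pos in range(len(key_guess)):
        # Ciphertext bytes at this position, from messages long enough to have one
        column = [c[pos] for c in ciphertexts if len(c) > pos]
        best_score, best_byte = 0.0, None
        for i, candidate in enumerate(column):
            others = column[:i] + column[i + 1:]
            if not others:
                continue
            xors = [candidate ^ other for other in others]
            letters = sum(is_ascii_letter(x) for x in xors)
            zeros = xors.count(0)
            score = (letters + zeros) / len(others)
            # Require at least one letter so identical columns are not mistaken for spaces
            if letters >= 1 and score > best_score:
                best_score, best_byte = score, candidate
        if best_byte is not None and best_score >= threshold:
            key_guess[pos] = best_byte ^ 0x20  # assume the plaintext byte was a space
    return key_guess

# Illustrative messages, all 19 bytes long, encrypted with one reused key
messages = [
    b'the quick brown fox',
    b'jumps over the lazy',
    b'dog while the crowd',
    b'cheers and applauds',
    b'many time pad fails',
]
key = os.urandom(19)
ciphertexts = [xor_bytes(m, key) for m in messages]

key_guess = guess_key(ciphertexts)
recovered = [pos for pos, k in enumerate(key_guess) if k is not None]
print('Key positions guessed:', recovered)

# Decrypt the first message with the partial key; unknown positions print as '?'
partial = ''.join(chr(c ^ k) if k is not None else '?' for c, k in zip(ciphertexts[0], key_guess))
print(partial)
```

The heuristic only recovers key bytes in columns where some message has a space, and it can guess wrong when punctuation, digits, or many spaces share a column, so in practice the remaining bytes are usually filled in by guessing likely words and checking that every message decrypts sensibly.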