# Exploring the Security of One-Time and Many-Time Pads

## Overview
This team project investigates the cryptographic security of the one-time pad, known for its perfect security under certain conditions, and explores the implications of reusing a key, referred to as a many-time pad. We will examine the practical aspects of implementing these encryption methods, analyze their vulnerabilities, and develop strategies to exploit these vulnerabilities in a controlled environment. Here is some background story in signal intelligence. https://en.wikipedia.org/wiki/Venona_project
## Objectives
1. Understand One-Time Pad
Examine the concept of the one-time pad, its implementation, and why it is considered perfectly secure.
2. Implement One-Time Pad Encryption and Decryption
Develop a Python program that simulates the encryption and decryption process between two parties.
3. Explore Many-Time Pad
Extend the one-time pad implementation to simulate a many-time pad scenario, where a single key is used to encrypt multiple messages.
4. Cryptanalysis of Many-Time Pad
Design and execute an attack strategy to decrypt messages encrypted with a many-time pad, focusing on exploiting the vulnerabilities introduced by key reuse.

## Problem 1: Understanding One-Time Pad
Tasks:
1. Research the theoretical basis of the one-time pad, including its requirements and operational principles.

## Problem 2: One-Time Pad Implementation
Tasks: the encryption and decryption process between two parties, Alice and Bob.
1. Alice's Program
Should prompt for a message input (plaintext), then display the ciphertext, and save both the ciphertext (in hex) and the key (in hex) in separate files.
2. Bob's Program:
Should read the key and ciphertext from their respective files and display the decrypted plaintext.

In [10]:
from pathlib import Path
import os
import string

# Print formatted output for better readability
def print_formatted(description:str, data:str):
    print(f'{description:<50}: {data}')

# Create a mailbox directory to store messages
def create_data_directory(directory_name:str) -> Path:
    file_directory = Path(f'./{directory_name}/')
    file_directory.mkdir(exist_ok=True)
    print_formatted('Using data directory', file_directory.name)
    return file_directory

# For the one-time pad, the key must be as long as the plaintext and is different for each message.

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

def one_time_pad_encrypt(plaintext:str) -> tuple:
  key = generate_key(len(plaintext))
  return xor_bytes(string_to_bytes(plaintext), key), key

def one_time_pad_decrypt(ciphertext:bytes, key:bytes) -> bytes:
  return xor_bytes(ciphertext, key)

# Write and read data to/from files
# Project requirements to store data as hex
# Convert string to hex and vice versa
def bytes_to_hex(byte_data: bytes) -> str:
    return byte_data.hex()

def hex_to_bytes(hex_string:str) -> bytes:
    return bytes.fromhex(hex_string)

def write_data(directory_path:Path, file_name:str, message:bytes) -> None:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open('w') as f:
        f.write(bytes_to_hex(message))

def read_data(directory_path:Path, file_name:str) -> bytes:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open() as f:
        return hex_to_bytes(f.read())

# Alice and Bob programs
def alice_program(message_name:str, directory_path:Path) -> None:
    # get user input
    plaintext = input('Enter a message: ')
    print_formatted('Plaintext', plaintext)
    # encrypt the message using one-time pad
    ciphertext, key = one_time_pad_encrypt(plaintext)
    print_formatted('Alice sending message (Ciphertext)', ciphertext)
    write_data(directory_path, message_name, ciphertext)
    write_data(directory_path, message_name + '_key', key)

def bob_program(message_name:str, directory_path:Path) -> None:
    ciphertext = read_data(directory_path, message_name)
    key = read_data(directory_path, message_name + '_key')
    # decrypt the message using one-time pad
    message = one_time_pad_decrypt(ciphertext, key)
    print_formatted('Bob received message (Plaintext)', bytes_to_string(message))

# Test the one-time pad implementation for alice and bob
otp_data_directory = create_data_directory('otp_data')
message_file_name = 'message1'
alice_program(message_file_name, otp_data_directory)
bob_program(message_file_name, otp_data_directory)

Using data directory                              : otp_data
Plaintext                                         : Hello bob
Alice sending message (Ciphertext)                : b'v\x1a\xb3\xf9|\x01:\x8b\x1f'
Bob received message (Plaintext)                  : Hello bob


## Problem 3: Exploring Many-Time Pad
Tasks: Modify the one-time pad implementation to encrypt multiple messages with the same key, simulating a many-time pad scenario. The purpose of this problem is to see if there are any recognizable patterns by observing the outputs. You can gain insights by changing the plaintexts or the key to verify your findings. These findings would be useful in the next problem.
1. The program should encrypt a list of 10 predefined plaintext messages with a single key, saving the plaintexts, key, and ciphertexts (all in hex) into a file. You can select 10 of your favorite messages. Assume the key is long enough to do encryption to all the 10 messages.

In [8]:
from pathlib import Path
import os

# Print formatted output for better readability
def print_formatted(description:str, data:str):
    print(f'{description:<50}: {data}')

# Create a mailbox directory to store messages
def create_data_directory(directory_name:str) -> Path:
    file_directory = Path(f'./{directory_name}/')
    file_directory.mkdir(exist_ok=True)
    print_formatted('Using data directory', file_directory.name)
    return file_directory

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# For storing and retreiving from file in hex
def string_to_hex(string:str) -> str:
    return string.encode('utf-8').hex()

def hex_to_string(hex_string:str) -> str:
    return bytes.fromhex(hex_string).decode('utf-8')

# Many-time pad encryption for multiple plaintexts using the same key
def many_time_pad_encrypt(plaintexts: list, key: bytes) -> list:
    ciphertexts = []
    for plaintext in plaintexts:
        ciphertexts.append(xor_bytes(string_to_bytes(plaintext), key))
    return ciphertexts

# Many-time pad decryption for multiple ciphertexts using the same key
def many_time_pad_decrypt(ciphertexts: list, keys: list) -> list:
    plaintexts = []
    for key, ciphertext in zip(keys, ciphertexts):
        plaintexts.append(bytes_to_string(xor_bytes(ciphertext, key)))
    return plaintexts

# Write and read many-time pad data to/from files
# The format of the files is key,plaintext,ciphertext
# The key, plaintext, and ciphertext are in hex format

# Write and read data to/from files
# Project requirements to store data as hex
# Convert string to hex and vice versa
def bytes_to_hex(byte_data: bytes) -> str:
    return byte_data.hex()

def hex_to_bytes(hex_string:str) -> bytes:
    return bytes.fromhex(hex_string)

def write_many_time_pad_data(directory_path: Path, file_name:str, key: bytes, plaintexts:list, ciphertexts: list) -> None:
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open('w') as f:
        for plaintext, ciphertext in zip(plaintexts, ciphertexts):
            f.write(f'{bytes_to_hex(key)},{string_to_hex(plaintext)},{bytes_to_hex(ciphertext)}\n')

def read_many_time_pad_data(directory_path: Path, file_name:str) -> tuple:
    keys = []
    plaintexts = []
    ciphertexts = []
    data_file = directory_path / f'{file_name}.txt'
    with data_file.open() as f:
        for line in f:
            k, p, c = line.split(',')
            keys.append(hex_to_bytes(k.strip()))
            plaintexts.append(hex_to_string(p.strip()))
            ciphertexts.append(hex_to_bytes(c.strip()))
    return keys, plaintexts, ciphertexts

# Many-time pad encryption and decryption for multiple messages
# 10 even longer messages to test
mtp_messages = [
    'hello world, how are you doing today?',
    'python programming is fun and challenging',
    'data science is the future of technology',
    'machine learning is a subset of artificial intelligence',
    'deep neural network is a type of machine learning',
    'cryptography is the practice and study of secure communication',
    'information security is the protection of information',
    'computer science is the study of computation',
    'software engineering is the application of engineering to software',
    'data analytics is the science of analyzing raw data'
    ]

mtp_key = generate_key(100)
mtp_ciphertexts = many_time_pad_encrypt(mtp_messages, mtp_key)
many_time_pad_data_directory = create_data_directory('mtp_data')
many_time_pad_file_name = 'many_time_pad_data'
write_many_time_pad_data(many_time_pad_data_directory, many_time_pad_file_name, mtp_key, mtp_messages, mtp_ciphertexts)
mtp_keys, mtp_plaintexts, mtp_ciphertexts = read_many_time_pad_data(many_time_pad_data_directory, many_time_pad_file_name)
print("Many-Time Pad Data:")
for i in range(len(mtp_messages)):
    print_formatted('Message', str(i+1))
    print_formatted('Many-Time Pad Key', mtp_keys[i])
    print_formatted('Many-Time Pad Plaintext', mtp_plaintexts[i])
    print_formatted('Many-Time Pad Ciphertext', mtp_ciphertexts[i])

print("Many-Time Pad Data decrypted:")
mtp_plaintexts_2 = many_time_pad_decrypt(mtp_ciphertexts, mtp_keys)
assert mtp_plaintexts == mtp_plaintexts_2
for plaintext in mtp_plaintexts_2:
    print_formatted('Many-Time Pad Decrypted Plaintext', plaintext)

Using data directory                              : mtp_data
Many-Time Pad Data:
Message                                           : 1
Many-Time Pad Key                                 : b"\xf0\xfc\xc3o\x1f*^,m\xb1\xd5H\xa3\x9bY\xf7\xf6{\xc8C>oX\xf1%\xa7\xdc\xf6\x91['\xeato\xba\xd2\x961vTS\xd6>X\x82\xaa\xd3\x0b\r\xa34\xcb\xe7g\x83.\x00\x8f\xdeD\xd13\x96m\xa2\x10@e\x89Q*S?\xe5w\x14\xfb\xda\xfa\xe5BIZ~\xde\xc3\xa1Q\xa8\xce\x8d\x7f\xe4\xf4[\x81\x9a\x8b\x01+"
Many-Time Pad Plaintext                           : hello world, how are you doing today?
Many-Time Pad Ciphertext                          : b'\x98\x99\xaf\x03p\n)C\x1f\xdd\xb1d\x83\xf36\x80\xd6\x1a\xba&\x1e\x167\x84\x05\xc3\xb3\x9f\xff<\x07\x9e\x1b\x0b\xdb\xab\xa9'
Message                                           : 2
Many-Time Pad Key                                 : b"\xf0\xfc\xc3o\x1f*^,m\xb1\xd5H\xa3\x9bY\xf7\xf6{\xc8C>oX\xf1%\xa7\xdc\xf6\x91['\xeato\xba\xd2\x961vTS\xd6>X\x82\xaa\xd3\x0b\r\xa34\xcb\xe7g\x83.\x00\x8f\xdeD\xd13\x

In [148]:
from collections import Counter

# Print formatted output for better readability
def print_formatted(description:str, data:str):
    print(f'{description:<50}: {data}')

def bytes_to_hex(byte_data: bytes) -> str:
    return byte_data.hex()

def hex_to_bytes(hex_string:str) -> bytes:
    return bytes.fromhex(hex_string)

def string_to_bytes(text_string:str) -> bytes:
    return text_string.encode('utf-8')

def bytes_to_string(byte_data:bytes) -> str:
    return byte_data.decode('utf-8')

# Generate a random key of a given length
def generate_key(length:int) -> bytes:
    return os.urandom(length)

def xor_bytes(data:bytes, key:bytes) -> bytes:
    return bytes(a ^ b for a, b in zip(data, key))

# For storing and retreiving from file in hex
def string_to_hex(string:str) -> str:
    return string.encode('utf-8').hex()

def hex_to_string(hex_string:str) -> str:
    return bytes.fromhex(hex_string).decode('utf-8')

collected_cyphertexts = [
  "71fe1ace4389087266117cd7c98c4182851b3acff3b086e3f83f94d6eb05c4ba85d8e1fa14f11d1c3b568ff6cff5c09c5d67ef5c9c71b7eeb3d45a5154ab17b83e071ce9d8988adb4afedf46a840",
  "71fe1ace559a1e7266117cd7ce8745d7be2e74c3f0f68eeef57e8884e607debf81dfa0f012f95819681ae7f29fe4839b5175ef5e8760bef0b9d44b504eba12b22f5404f89dd085d550a48865a14f9b15a94dabe609ca2df2cccf210cefdb1af5389719795e1f0179cb77c5c456954d88f3",
  "72fe069c51c81a20775928c7879d4fd2a93c3acff3f69fe5fe2e9493a303d9ea98c4e5b60ae40a146058e7c787fbd09a1474e25dc865b5e6af865d4a40a61bfd384e06e0cfc1ccd356ff8853ac438905fa5fe3fd41cb3bbc8ac9",
  "67e543885b9a5b2267177084cf8453ccb8633ad7fdb39de5b13f8a93a304d6bf8bc4f4ef5def110b6f56a3e186e2c68c1470ef5c9c2ffbd6a291571e40ba1afd3b4b1fe0c4cbccc15df5dc07b043da01fa6ae4fd158f37b3c0cd",
  "71fe029a148c1236320d7192878a59cfbc3a6ec5e7f68befb13196d6ea1ec4ea81d9e3fe50ea0f196d02a2f7cfe2c29c5577e35d8630baf6ea80465b01aa1abc394f57a1f4ccccda59ff8846e44b8805bb5cabe608c231f2dec8364ae7d90ab4358c5c3a421b06",
  "6ef914ce5989152b321a769ad79c42c7be6f6ad2fab19de1fc339d84f04ad3a589dfa0ff09ab0c196f13e7e780b4c097556ded57c871fbeea393464a01aa0ab1381848cfd2d6898918efc046b00b8940bb08e3f313cb23b3dfd8645cfcd80ff82489",
  "71fe1ace4389087266117cd7c4865bd2b93b7fd2b5a58ce9f4308c9ff01e97ab82cbf2ef5dfc101d6a56b3fb8ab4d08b4167ef5c9c30b8f0ab97455b45e81efd364605e49ddb83df48eedc42b60c900fb14db4b229ca74b6c4d96442e1c34df8288f5c3a450a527ecc7c82865b8e",
  "71fe029a148c1437615978d7c58854dbec2c75cde5a39be5e37e9b97ef0697a285dfa0f01cff101d764983f29bf5",
  "71fe1ace50875b31730d6ad7cb8640c7ec3c73d4e1bf81e7b13796d6e518d8a4988ceff05dff101d2415a8fe9fe1d79a4623eb5e8430bfe3b3d442514faf40fd18420be0c8cb89924cf3cd5ee448950efd5cabe500c120f2d9d26440ebc34de029811977430b01748276d79012955cc6a65aebb9054becda5c9278"
]

max_len = 0
byte_texts = []
for c_text in collected_cyphertexts:
    byte_text = hex_to_bytes(c_text)
    print_formatted("Hex:", byte_text)
    max_len = max(max_len, len(c_text))
    byte_texts.append(byte_text)

print(f'Max length of key: {max_len}')
source_ciphertexts = byte_texts[0:7]
target_ciphertext = byte_texts[8]


Hex:                                              : b'q\xfe\x1a\xceC\x89\x08rf\x11|\xd7\xc9\x8cA\x82\x85\x1b:\xcf\xf3\xb0\x86\xe3\xf8?\x94\xd6\xeb\x05\xc4\xba\x85\xd8\xe1\xfa\x14\xf1\x1d\x1c;V\x8f\xf6\xcf\xf5\xc0\x9c]g\xef\\\x9cq\xb7\xee\xb3\xd4ZQT\xab\x17\xb8>\x07\x1c\xe9\xd8\x98\x8a\xdbJ\xfe\xdfF\xa8@'
Hex:                                              : b'q\xfe\x1a\xceU\x9a\x1erf\x11|\xd7\xce\x87E\xd7\xbe.t\xc3\xf0\xf6\x8e\xee\xf5~\x88\x84\xe6\x07\xde\xbf\x81\xdf\xa0\xf0\x12\xf9X\x19h\x1a\xe7\xf2\x9f\xe4\x83\x9bQu\xef^\x87`\xbe\xf0\xb9\xd4KPN\xba\x12\xb2/T\x04\xf8\x9d\xd0\x85\xd5P\xa4\x88e\xa1O\x9b\x15\xa9M\xab\xe6\t\xca-\xf2\xcc\xcf!\x0c\xef\xdb\x1a\xf58\x97\x19y^\x1f\x01y\xcbw\xc5\xc4V\x95M\x88\xf3'
Hex:                                              : b'r\xfe\x06\x9cQ\xc8\x1a wY(\xc7\x87\x9dO\xd2\xa9<:\xcf\xf3\xf6\x9f\xe5\xfe.\x94\x93\xa3\x03\xd9\xea\x98\xc4\xe5\xb6\n\xe4\n\x14`X\xe7\xc7\x87\xfb\xd0\x9a\x14t\xe2]\xc8e\xb5\xe6\xaf\x86]J@\xa6\x1b\xfd8N\x06\xe0\xcf\xc1\xcc\xd3V\xff\x88

In [199]:
import string
import re
import nltk
from nltk.corpus import words

# regex pattern for allowable characters
pattern = r'^[A-Za-z0-9 .?]+$'

def contains_allowable_characters(bytes: bytes) -> bool:
    return bool(re.search(pattern, bytes.decode('latin1')))

# Download the word list if not already downloaded
nltk.download('words')

# Load the word list
word_list = set(words.words())

# Check if the word or phrase is part of a real word
def is_part_of_real_word(bytestring: bytes) -> bool:
    word_or_phrase = bytestring.decode('latin1').split(' ')
    for word in word_or_phrase:
        if word.lower() not in word_list:
            return False
    return True

def evaluate_possible_plaintext(possible_plaintext: bytes) -> bool:
    return contains_allowable_characters(possible_plaintext) and is_part_of_real_word(possible_plaintext)

# Update the value of the key at a specific position
def set_key(key: list, position: int, value: bytes) -> list:
    for i in range(len(value)):
        key[position + i] = value[i]
    return key

def set_key_from(key: bytes, position: int, ciphertext: bytes, value: bytes) -> bytes:
    new_key = xor_bytes(ciphertext[position:position + len(value)], value)
    return bytes(set_key(list(key), position, new_key))

def crib_drag(key, ciphertext: bytes, ciphertexts: bytes, crib: bytes) -> (bytes, list):
    new_key = list(key)
    crib_length = len(crib)
    found_plaintexts = []
    for ct_num, next_ciphertext in enumerate(ciphertexts):
        if ciphertext == next_ciphertext:
            continue

        min_len = min(len(ciphertext), len(next_ciphertext))
        # Drag the crib
        for i in range(min_len - crib_length + 1):
            # If you already have this key position filled out, skip
            if new_key[i] != 0:
                continue

            # Get the segment of the ciphertext from the target and source
            segment1 = ciphertext[i:i + crib_length]
            segment2 = next_ciphertext[i:i + crib_length]

            # Skip if the segments are the same, this will result in a xor of 0x00
            if segment1 == segment2:
                continue

            xor_segment = xor_bytes(segment1, segment2)
            possible_plaintext = xor_bytes(xor_segment, crib)

            # Generate the possible key from the crib against the source and the target. Since we don't
            # know if the crib is in the target or the source, we need to check both possibilities.
            # Evaluate the possible key against all the source texts. If the key is good, update the key.
            # If not, update the key with the source text xor'd with the crib
            good_key = True
            if evaluate_possible_plaintext(possible_plaintext):
                found_plaintexts.append(f'Position {i}: in ct {ct_num} for {crib} {possible_plaintext}')
                possible_key = xor_bytes(segment1, crib)
                for ct in ciphertexts:
                    test_text = xor_bytes(ct[i:i + crib_length], possible_key)
                    if not evaluate_possible_plaintext(test_text):
                        good_key = False
                        break
                if good_key:
                    new_key = set_key(new_key, i, possible_key)
                else:
                    new_key = set_key(new_key, i, xor_bytes(segment2, crib))



    return bytes(new_key), found_plaintexts

# Example usage
key = bytes(max_len)

key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b'There ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b'Why ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' cat')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' and ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' why ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' how ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' then ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' there ')
key, results = crib_drag(key, target_ciphertext, source_ciphertexts, b' they ')

key = set_key_from(key, 4, target_ciphertext, b'does ')


print(key)
for result in results:
    print(result)

print(xor_bytes(key, target_ciphertext))
for ciphertext in source_ciphertexts:
    print(xor_bytes(key, ciphertext))



[nltk_data] Downloading package words to
[nltk_data]     C:\Users\nmaiorana\AppData\Roaming\nltk_data...
[nltk_data]   Package words is already up-to-date!


b"&\x96c\xee4\xe8>BSy\x08\xa4\xa7\x00\x00\xa2\xccO\x1a\x00\x95\xd6\x00\x00\xd8^\x00\xa4\x83j\xb7\xca\x00\xac\x80\x964\x8bx\x00Hv\xc7\x00\xef\x94\xa3\x004\x03\x00\x00\xe8Q\xdb\x00\xca\xa6\x00>!\xc87\x98\x00'h\x81\xbd\xb8\xec\xfb8\x84\xa8'\x88\x1c\xe0|\x9b\x08\x8b\x84a\xafT\x9c\xaa\x00\x01,\x00\x00:\x94\x00\x00|\x1a\x00\x00r\x00\xeb\x00\xa2\xe4v\x00\x00\xa8\xd3\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00"
b'Why does tbsl\x86@e si\xd4ti\x81\xe7ii\x96rfron\x98 ofith\x1dlco\xfeput\x9ar \xeb^lad\xe3yrBongwe\x18ec

In [130]:
import re

def contains_only_allowable_characters(s: str) -> bool:
    pattern = r'^[A-Za-z0-9 .?]+$'
    return bool(re.match(pattern, s))

# Example usage
test_string = "Hello World 123.?"
print(contains_only_allowable_characters(test_string))  # Output: True

test_string = "helloworld"
print(contains_only_allowable_characters(test_string))  # Output: True

test_string = "hel."
print(contains_only_allowable_characters(test_string))  # Output: False

True
True
True


In [136]:
import nltk
from nltk.corpus import words

# Download the word list if not already downloaded
nltk.download('words')

# Load the word list
word_list = set(words.words())

def is_real_word(word: str) -> bool:
    return word.lower() in word_list

# Example usage
test_word = "hello"
print(is_real_word(test_word))  # Output: True

test_word = "llo"
print(is_real_word(test_word))  # Output: False

[nltk_data] Downloading package words to
[nltk_data]     C:\Users\nmaiorana\AppData\Roaming\nltk_data...
[nltk_data]   Package words is already up-to-date!


True
False


In [97]:
key = bytes(max_len)
crib = b'f'
start_char = 3
end_char = start_char + 1
ct1 = byte_texts[0][start_char:end_char]
ct2 = byte_texts[3][start_char:end_char]
print(ct1)
print(ct2)
possible_pt = xor_bytes(ct1, ct2)
possible_pt = xor_bytes(possible_pt, crib)
print(possible_pt)

b'\xce'
b'\x88'
b' '


In [115]:
def crib_drag(ciphertexts, crib):
    """
    Perform crib dragging on a list of ciphertexts using a given crib.

    Args:
        ciphertexts (list): List of ciphertexts (byte strings).
        crib (bytes): The known plaintext (crib) to be used in the attack.

    Returns:
        list: List of possible plaintexts for each ciphertext.
    """
    possible_plaintexts = []

    for i in range(len(ciphertexts)):
        possible_plaintext = []

        for j in range(len(ciphertexts[i]) - len(crib) + 1):
            xor_segment = bytes(a ^ b for a, b in zip(ciphertexts[i][j:j + len(crib)], crib))
            possible_plaintext.append((j, xor_segment))

        possible_plaintexts.append(possible_plaintext)

    return possible_plaintexts

# Example usage
crib = b' there '

results = crib_drag(byte_texts, crib)

for i, result in enumerate(results):
    print(f"Ciphertext {i + 1}:")
    for position, line in result:
        for k, pstring in enumerate(line):
            print(f"Position {position}: {k}, {pstring}")



Ciphertext 1:
Position 0: 0, 81
Position 0: 1, 138
Position 0: 2, 114
Position 0: 3, 171
Position 0: 4, 49
Position 0: 5, 236
Position 0: 6, 40
Position 1: 0, 222
Position 1: 1, 110
Position 1: 2, 166
Position 1: 3, 38
Position 1: 4, 251
Position 1: 5, 109
Position 1: 6, 82
Position 2: 0, 58
Position 2: 1, 186
Position 2: 2, 43
Position 2: 3, 236
Position 2: 4, 122
Position 2: 5, 23
Position 2: 6, 70
Position 3: 0, 238
Position 3: 1, 55
Position 3: 2, 225
Position 3: 3, 109
Position 3: 4, 0
Position 3: 5, 3
Position 3: 6, 49
Position 4: 0, 99
Position 4: 1, 253
Position 4: 2, 96
Position 4: 3, 23
Position 4: 4, 20
Position 4: 5, 116
Position 4: 6, 92
Position 5: 0, 169
Position 5: 1, 124
Position 5: 2, 26
Position 5: 3, 3
Position 5: 4, 99
Position 5: 5, 25
Position 5: 6, 247
Position 6: 0, 40
Position 6: 1, 6
Position 6: 2, 14
Position 6: 3, 116
Position 6: 4, 14
Position 6: 5, 178
Position 6: 6, 233
Position 7: 0, 82
Position 7: 1, 18
Position 7: 2, 121
Position 7: 3, 25
Position 7: 

In [8]:
from binascii import unhexlify

# Collected ciphertexts (hex strings)
ciphertexts = [
    "71fe1ace4389087266117cd7c98c4182851b3acff3b086e3f83f94d6eb05c4ba85d8e1fa14f11d1c3b568ff6cff5c09c5d67ef5c9c71b7eeb3d45a5154ab17b83e071ce9d8988adb4afedf46a840",
    "71fe1ace559a1e7266117cd7ce8745d7be2e74c3f0f68eeef57e8884e607debf81dfa0f012f95819681ae7f29fe4839b5175ef5e8760bef0b9d44b504eba12b22f5404f89dd085d550a48865a14f9b15a94dabe609ca2df2cccf210cefdb1af5389719795e1f0179cb77c5c456954d88f3",
    "72fe069c51c81a20775928c7879d4fd2a93c3acff3f69fe5fe2e9493a303d9ea98c4e5b60ae40a146058e7c787fbd09a1474e25dc865b5e6af865d4a40a61bfd384e06e0cfc1ccd356ff8853ac438905fa5fe3fd41cb3bbc8ac9",
    "67e543885b9a5b2267177084cf8453ccb8633ad7fdb39de5b13f8a93a304d6bf8bc4f4ef5def110b6f56a3e186e2c68c1470ef5c9c2ffbd6a291571e40ba1afd3b4b1fe0c4cbccc15df5dc07b043da01fa6ae4fd158f37b3c0cd",
    "71fe029a148c1236320d7192878a59cfbc3a6ec5e7f68befb13196d6ea1ec4ea81d9e3fe50ea0f196d02a2f7cfe2c29c5577e35d8630baf6ea80465b01aa1abc394f57a1f4ccccda59ff8846e44b8805bb5cabe608c231f2dec8364ae7d90ab4358c5c3a421b06",
    "6ef914ce5989152b321a769ad79c42c7be6f6ad2fab19de1fc339d84f04ad3a589dfa0ff09ab0c196f13e7e780b4c097556ded57c871fbeea393464a01aa0ab1381848cfd2d6898918efc046b00b8940bb08e3f313cb23b3dfd8645cfcd80ff82489",
    "71fe1ace4389087266117cd7c4865bd2b93b7fd2b5a58ce9f4308c9ff01e97ab82cbf2ef5dfc101d6a56b3fb8ab4d08b4167ef5c9c30b8f0ab97455b45e81efd364605e49ddb83df48eedc42b60c900fb14db4b229ca74b6c4d96442e1c34df8288f5c3a450a527ecc7c82865b8e",
    "71fe029a148c1437615978d7c58854dbec2c75cde5a39be5e37e9b97ef0697a285dfa0f01cff101d764983f29bf5",
    "71fe1ace50875b31730d6ad7cb8640c7ec3c73d4e1bf81e7b13796d6e518d8a4988ceff05dff101d2415a8fe9fe1d79a4623eb5e8430bfe3b3d442514faf40fd18420be0c8cb89924cf3cd5ee448950efd5cabe500c120f2d9d26440ebc34de029811977430b01748276d79012955cc6a65aebb9054becda5c9278",
    "71fe029a1483123c76597691878459cca9363ac4faf68ceffc2e8d82e61897b98fc5e5f809e20b0c7756b2e08aab83bc5560e257"
]

# Target ciphertext
target_ciphertext = "71fe0680149d083b7c1e3996879a42d0a92e7780f6bf9fe8f42cd898e61cd2b8ccd9f3f35dff101d241da2eacff9cc8d5123fe5a897efbeda4974b"

# Convert hex strings to bytes
ciphertexts = byte_texts[0:7]
target_ciphertext = byte_texts[8]

# XOR two byte strings
def xor_bytes(b1, b2):
    return bytes([x ^ y for x, y in zip(b1, b2)])

# XOR each ciphertext with the target ciphertext
xor_results = [xor_bytes(target_ciphertext, ct) for ct in ciphertexts]

# Analyze XOR results for patterns
for i, result in enumerate(xor_results):
    print(f"XOR result {i+1}: {result}")

# added this function to automate the space.
# Function to find potential spaces in XOR results
def find_spaces(xor_results):
    for i, result in enumerate(xor_results):
        print(f"Analyzing XOR result {i+1}:")
        for j, byte in enumerate(result):
            # Check if the byte corresponds to a lowercase or uppercase letter
            if 0x41 <= byte <= 0x5A or 0x61 <= byte <= 0x7A:
                print(f"  Byte {j}: {chr(byte)} (possible space at this position)")
        print()

# Analyze XOR results for spaces
find_spaces(xor_results)

XOR result 1: b"\x00\x00\x00\x00\x13\x0eSC\x15\x1c\x16\x00\x02\n\x01Ei'I\x1b\x12\x0f\x07\x04I\x08\x02\x00\x0e\x1d\x1c\x1e\x1dT\x0e\nI\x0e\r\x01\x1fC'\x08P\x14\x17\x06\x1bD\x04\x02\x18A\x08\r\x00\x00\x18\x00\x1b\x04WE&E\x17\t\x10S\x03I\x06\r\x12\x18L\x08"
XOR result 2: b'\x00\x00\x00\x00\x05\x1dEC\x15\x1c\x16\x00\x05\x01\x05\x10R\x12\x07\x17\x11I\x0f\tDI\x1eR\x03\x1f\x06\x1b\x19SO\x00O\x06H\x04L\x0fO\x0c\x00\x05T\x01\x17V\x04\x00\x03P\x01\x13\n\x00\t\x01\x01\x15RO7\x16\x0f\x18U\x1b\x0cG\x1cWE;E\x07\x0e\x1bT\x11\x00\x03\t\x0b\r\x00\x15\x1dEL\x04\x18W\x15\x11\x16\x00\x0e\x1d\x14\x00\rI\x01\x12TD\x00\x11NU'
XOR result 3: b'\x03\x00\x1cR\x01OA\x11\x04TB\x10L\x1b\x0f\x15E\x00I\x1b\x12I\x1e\x02O\x19\x02EF\x1b\x01N\x00H\nFW\x1b\x1a\tDMO9\x18\x1a\x07\x00RW\t\x03LU\n\x05\x1cR\x1f\x1b\x0f\t[\x00 \x0c\r\x00\x07\nEA\x1a\x0cE\rH\x0b\x1c\x0b\x07\x03H\x18A\n\x1bNS\x1b'
XOR result 4: b'\x16\x1bYF\x0b\x1d\x00\x13\x14\x1a\x1aS\x04\x02\x13\x0bT_I\x03\x1c\x0c\x1c\x02\x00\x08\x1cEF\x1c\x0e\x1b\x13H\x1b\x1f\