# Description

For this exercise you will perform a few steps, spread across several functions, that use the modules you learned about in this lesson, with minor background from other parts of this course.

As part of these steps, you will perform informal "profiling" of each operation by measuring the time it takes.  It is difficult to validate these times in the unit tests, since operations can take significantly more or less time depending on system factors outside your control (or my control as author).  Please verify yourself that your reported times seem reasonable.
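
A minimal sketch of this kind of informal timing is shown here. It assumes a hypothetical helper named `timed`, which is not part of the lesson, and uses `time.perf_counter` from the standard library. The measured value will differ from run to run and from machine to machine.

```python
from time import perf_counter

def timed(func, *args, **kwargs):
    "Hypothetical helper: return (result, elapsed seconds) for one call"
    start = perf_counter()
    result = func(*args, **kwargs)
    return result, perf_counter() - start

# Time an arbitrary operation; summing a range is only a stand-in workload
total, elapsed = timed(sum, range(1_000_000))
print(f"Summing took {elapsed:0.5f} seconds")
```

Subtracting two clock readings around the operation is all that "profiling" means in this exercise; any monotonic clock would serve.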

In cryptography, there is a concept of a "one-time pad" where a message is combined, byte by byte, with a shared random key that is as long as the message.  That is the general idea of these steps.  This exercise is **not** a complete cryptosystem, so do not use it as such.

If you are not familiar with XOR'ing bytes, here is a simple example to follow:

```python
>>> mess = "My message".encode()
>>> key  = "secretsecretsecret".encode()  # terrible key
>>> encrypted = bytes(m ^ k for m, k in zip(mess, key))
>>> encrypted
b'>\x1cC\x1f\x00\x07\x00\x04\x04\x17'
```

Reversing it is simply XOR'ing the encrypted version with the same key segment:

```python
>>> bytes(e ^ k for e, k in zip(encrypted, key))
b'My message'
```
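
The same round trip can be sketched with a random key that is exactly as long as the message, which is closer to the one-time pad idea. The helper name `xor_bytes` is an assumption made for illustration and is not part of the lesson. It builds its result with `bytes(...)` so that every XOR result, including values of 128 or more, stays a single byte.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    "Hypothetical helper: XOR data with an equally long prefix of key"
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = "My message".encode()
pad = secrets.token_bytes(len(message))   # random key, same length as message
ciphertext = xor_bytes(message, pad)
recovered = xor_bytes(ciphertext, pad)    # XOR with the same pad undoes it
print(recovered)
```

Because the pad is random, the ciphertext changes on every run, while the recovered bytes always equal the original message.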

We will use the three nursery rhymes used in this lesson as plaintext examples.  You will need to:

* Generate one million bytes of random key material
* Save that key in a named temporary file
* Time how long it takes to generate the key and print the number to screen
* Perform OTP encryption of each nursery rhyme, saving the encrypted version to disk with the extra extension `.enc`.  For example, save `newberry.txt.enc` but do not remove `newberry.txt`.
* Use a distinct, non-overlapping segment of the key material for each encryption
* Time the encryption operations
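
The requirement for distinct, non-overlapping key segments can be sketched as follows. The helper `segment_offsets`, the sample messages, and the deterministic stand-in key are all assumptions for illustration; a real key would be random.

```python
def segment_offsets(lengths):
    "Hypothetical helper: start of each segment when packed end to end"
    starts, position = [], 0
    for length in lengths:
        starts.append(position)
        position += length
    return starts

key = bytes(range(256)) * 4   # stand-in key material, NOT random
messages = [b"first", b"second one", b"third"]

starts = segment_offsets(len(m) for m in messages)
# Each message gets its own slice of the key; no byte is used twice
segments = [key[s:s + len(m)] for s, m in zip(starts, messages)]
print(starts)   # [0, 5, 15]
```

Starting each segment where the previous one ends uses the key material without gaps or overlap, as long as the total message length does not exceed the key length.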

# Setup

In [1]:
# No need to modify this, feel free to utilize it
def encrypt(plaintext: str, key: bytes) -> bytes:
    "Fully working encryption (no key schedule or text handling)"
    plainbytes = plaintext.encode('ascii')
    return b''.join(bytes([m^k]) for m, k in zip(plainbytes, key))

def decrypt(crypttext: bytes, key: bytes) -> str:
    "Fully working decryption (no key schedule or text handling)"
    return ''.join(chr(m^k) for m, k in zip(crypttext, key))

In [2]:
def create_OTP():
    "Create temporary file with OTP and return its name"
    keyfile = '/dev/zero'
    # ... write a million random bytes to keyfile
    # Report time of operation
    print("Key generation took 1.234 seconds")
    return keyfile

keyfile = create_OTP()

# Might want to generalize this to find all text files
sources = ['carey.txt', 'king.txt', 'newberry.txt']

# FIXME: non-overlapping OTP key needed for each file
offsets = [0, 100, 200]

def encrypt_file(fname: str, keyfile: str, offset: int):
    "Encrypt a file"
    # ... Read the plaintext
    # ... Perform encryption
    # ... Write encrypted version to "fname.enc"
    print(f"Encrypted file {fname}.enc in 2.345 seconds")
    
for fname, offset in zip(sources, offsets):
    encrypt_file(fname, keyfile, offset)

Key generation took 1.234 seconds
Encrypted file carey.txt.enc in 2.345 seconds
Encrypted file king.txt.enc in 2.345 seconds
Encrypted file newberry.txt.enc in 2.345 seconds


# Solution

In [3]:
import secrets
import tempfile
from time import time
from pathlib import Path

def create_OTP():
    "Create temporary file with OTP and return its name"
    start = time()
    # delete=False keeps the file on disk after the with block closes it
    with tempfile.NamedTemporaryFile(delete=False) as keyfile:
        # Write a million random bytes to keyfile
        keyfile.write(secrets.token_bytes(1_000_000))
    # Report time of operation
    print(f"Key generation took {time()-start:0.3f} seconds")
    return keyfile.name

keyfile = create_OTP()
sources = sorted(Path('.').glob('*.txt'))  # sorted for a stable order

offset, offsets = 0, []
for source in sources:
    offsets.append(offset)
    offset += source.stat().st_size
offsets

def encrypt_file(fname: str, keyfile: str, offset: int):
    "Encrypt a file"
    start = time()
    plaintext = Path(fname).read_text()
    with open(keyfile, 'rb') as fh:
        fh.seek(offset)
        key = fh.read(len(plaintext))
    encrypted = encrypt(plaintext, key)
    Path(f"{fname}.enc").write_bytes(encrypted)
    print(f"Encrypted file {fname}.enc in {time()-start:0.5f} seconds")

for fname, offset in zip(sources, offsets):
    encrypt_file(fname, keyfile, offset)

Key generation took 0.048 seconds
Encrypted file carey.txt.enc in 0.00271 seconds
Encrypted file king.txt.enc in 0.00580 seconds
Encrypted file newberry.txt.enc in 0.00167 seconds


# Test Cases

In [4]:
def test_keyfile():
    from pathlib import Path
    assert Path(keyfile).stat().st_size == 1_000_000
    
test_keyfile()

In [5]:
def test_offsets():
    # Assure distinct key material
    from pathlib import Path
    
    assert all(0 <= offset < 1_000_000 for offset in offsets)
    
    offset = offsets[0]
    for fname, next_offset in zip(sources, offsets[1:]):
        # The gap up to the next offset must hold this file's key bytes
        assert Path(fname).stat().st_size <= next_offset - offset, \
            f"Not enough key bytes for {fname}"
        offset = next_offset
        
test_offsets()

In [6]:
def test_encryption():
    from pathlib import Path
    for fname, offset in zip(sources, offsets):
        with open(keyfile, 'rb') as fh:
            fh.seek(offset)
            key = fh.read(10_000)  # Assume small plaintext
        plaintext = Path(fname).read_text()
        decrypted = decrypt(Path(f"{fname}.enc").read_bytes(), key)
        assert plaintext == decrypted, f"Error in {fname}.enc"
        
test_encryption()