<a href="https://colab.research.google.com/github/Jamesalambert/cryptopals/blob/main/set_3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1. The CBC padding oracle
This is the best-known attack on modern block-cipher cryptography.

Combine your padding code and your CBC code to write two functions.

The first function should select at random one of the following 10 strings:
```
MDAwMDAwTm93IHRoYXQgdGhlIHBhcnR5IGlzIGp1bXBpbmc=
MDAwMDAxV2l0aCB0aGUgYmFzcyBraWNrZWQgaW4gYW5kIHRoZSBWZWdhJ3MgYXJlIHB1bXBpbic=
MDAwMDAyUXVpY2sgdG8gdGhlIHBvaW50LCB0byB0aGUgcG9pbnQsIG5vIGZha2luZw==
MDAwMDAzQ29va2luZyBNQydzIGxpa2UgYSBwb3VuZCBvZiBiYWNvbg==
MDAwMDA0QnVybmluZyAnZW0sIGlmIHlvdSBhaW4ndCBxdWljayBhbmQgbmltYmxl
MDAwMDA1SSBnbyBjcmF6eSB3aGVuIEkgaGVhciBhIGN5bWJhbA==
MDAwMDA2QW5kIGEgaGlnaCBoYXQgd2l0aCBhIHNvdXBlZCB1cCB0ZW1wbw==
MDAwMDA3SSdtIG9uIGEgcm9sbCwgaXQncyB0aW1lIHRvIGdvIHNvbG8=
MDAwMDA4b2xsaW4nIGluIG15IGZpdmUgcG9pbnQgb2g=
MDAwMDA5aXRoIG15IHJhZy10b3AgZG93biBzbyBteSBoYWlyIGNhbiBibG93
```
... generate a random AES key (which it should save for all future encryptions), pad the string out to the 16-byte AES block size and CBC-encrypt it under that key, providing the caller the ciphertext and IV.

The second function should consume the ciphertext produced by the first function, decrypt it, check its padding, and return true or false depending on whether the padding is valid.

## What you're doing here.
This pair of functions approximates AES-CBC encryption as it's deployed server-side in web applications; the second function models the server's consumption of an encrypted session token, as if it were a cookie.

It turns out that it's possible to decrypt the ciphertexts provided by the first function.

The decryption here depends on a side-channel leak by the decryption function. The leak is the error message that reveals whether or not the padding is valid.
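
Why that single bit is enough can be written down with the standard CBC decryption identity. The notation is introduced here for illustration and is not taken from the challenge text: $C_0$ is the IV, $C_i$ are the ciphertext blocks, and $D_k$ is AES block decryption under the secret key. CBC decryption computes

$$P_i = D_k(C_i) \oplus C_{i-1}.$$

If an attacker submits $C'_{i-1} = C_{i-1} \oplus \Delta$ in place of $C_{i-1}$, the block $C_i$ decrypts to

$$P'_i = D_k(C_i) \oplus C_{i-1} \oplus \Delta = P_i \oplus \Delta,$$

so every padding verdict tells the attacker something about $P_i \oplus \Delta$ for a $\Delta$ of the attacker's choosing.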

You can find 100 web pages on how this attack works, so I won't re-explain it. What I'll say is this:

The fundamental insight behind this attack is that the byte 01h is valid padding, and occurs in 1/256 trials of "randomized" plaintexts produced by decrypting a tampered ciphertext.

02h in isolation is not valid padding.

02h 02h is valid padding, but is much less likely to occur randomly than 01h.

03h 03h 03h is even less likely.

So you can assume that if you corrupt a decryption AND it had valid padding, you know what that padding byte is.
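
To put numbers on those likelihoods, here is a minimal sketch. It assumes the tampered plaintext is uniformly random. The helper names `is_valid_pkcs7` and `chance_of_padding` are hypothetical and are not part of this notebook.

```python
from fractions import Fraction

def is_valid_pkcs7(block: bytes) -> bool:
    """True if block ends in n copies of the byte n, with 1 <= n <= len(block)."""
    n = block[-1]
    return 1 <= n <= len(block) and block[-n:] == bytes([n]) * n

def chance_of_padding(n: int) -> Fraction:
    """Chance that a uniformly random block ends in n copies of the byte n."""
    return Fraction(1, 256 ** n)

# Enumerate every possible 2-byte tail to confirm the counts by brute force.
tails = [bytes([a, b]) for a in range(256) for b in range(256)]
ends_in_01 = sum(1 for t in tails if t[-1] == 1)
ends_in_02_02 = sum(1 for t in tails if t == b"\x02\x02")

for n in (1, 2, 3):
    print(f"padding of length {n}: {chance_of_padding(n)}")
print(f"{ends_in_01} of {len(tails)} tails end in 01; {ends_in_02_02} ends in 02 02")
```

A valid verdict on a tampered block is therefore almost always the 01h case, and the rare longer paddings can be ruled out with one extra query.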

It is easy to get tripped up on the fact that CBC plaintexts are "padded". Padding oracles have nothing to do with the actual padding on a CBC plaintext. It's an attack that targets a specific bit of code that handles decryption. You can mount a padding oracle on any CBC block, whether it's padded or not.

In [None]:
#@title installing cryptography
!pip3 install cryptography --quiet


In [None]:
#@title pkcs7 padding
validPKCSStrings = [b'ICE ICE BABY\x04\x04\x04\x04']
invalidPKCSStrings = [b'ICE ICE BABY\x05\x05\x05\x05', b'ICE ICE BABY\x01\x02\x03\x04']

def pkcs7Padding(data, blockSize):
    overhang = len(data) % blockSize
    paddingLength = blockSize if overhang == 0 else blockSize - overhang
    padding = bytes([paddingLength] * paddingLength)
    return data + padding

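def removePadding(data):
    # Valid PKCS#7 padding is n copies of the byte n, with 1 <= n <= len(data).
    if len(data) == 0:
        raise ValueError("Invalid PKCS7 padding", data)
    lastByte = data[-1]
    validLength = 1 <= lastByte <= len(data)
    if validLength and data[-lastByte:] == bytes([lastByte] * lastByte):
        return data[:-lastByte]
    raise ValueError("Invalid PKCS7 padding", data)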


for candidate in validPKCSStrings + invalidPKCSStrings:
    try:
        print(removePadding(candidate))
    except ValueError as error:
        print(error.args)

b'ICE ICE BABY'
('Invalid PKCS7 padding', b'ICE ICE BABY\x05\x05\x05\x05')
('Invalid PKCS7 padding', b'ICE ICE BABY\x01\x02\x03\x04')


In [None]:
#@title helper functions
import random

def xorBytes(in1, in2):
  xor = bytes([x ^ y for (x,y) in zip(in1, in2)])
  return xor

def reshape(data, keySize):
  # Split data into keySize-byte blocks. A short final block is PKCS#7-padded;
  # block-aligned data is returned as-is, with no extra padding block.
  padded = pkcs7Padding(data, keySize)
  return [padded[i : i + keySize] for i in range(0, len(data), keySize)]

def flatten(byteArray):
    # Concatenate a list of byte strings into one.
    return b''.join(byteArray)

import os

def randomBytes(length):
    # Keys and IVs should come from the OS CSPRNG; the random module is not cryptographically secure.
    return os.urandom(length)


In [None]:
#@title AES - ECB
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

testString = b"hbdft3fsbnhlngbe"
testKey = b"yellow submarine"

def encryptAES_ECB(data, key):
    # Raw AES-ECB: no padding is applied here, so data must be a multiple of the 16-byte block size.
    AESCipher = Cipher(algorithms.AES(key), modes.ECB())
    encryptor = AESCipher.encryptor()
    return encryptor.update(data) + encryptor.finalize()

def decryptAES_ECB(data, key):
    # Raw AES-ECB decryption; the caller is responsible for removing any padding.
    AESCipher = Cipher(algorithms.AES(key), modes.ECB())
    decryptor = AESCipher.decryptor()
    return decryptor.update(data) + decryptor.finalize()

paddedPT = pkcs7Padding(testString, len(testKey))
ct = encryptAES_ECB(paddedPT, testKey)
recoveredPlaintext = decryptAES_ECB(ct, testKey)

assert removePadding(recoveredPlaintext) == testString
# print(f"in: {testString}\nct: {ct}\npt: {recoveredPlaintext}")

In [None]:
#@title AES - CBC
testString = b"9AuhqAjjiKLU&>\rY5LtmCn`\rb\x0bU ga&,jg\r\x0bJv|6h1%7Q'w\x0c0gLRe\x0c^jzO.6UhPp$|fij6D\rk\x0c(rvH\x0c'pe3VPnQn[\nyJz\\3-nYJFJ(A[CxGt2~Fq\\ V\n&=#>[hTt9EM>#O!,mX*jF0D%r{PL$6yB:}PZ+]4#hN3}"
testKey = b"YELLOW SUBMARINE"
testIV = b"0000000000000000"

def encryptAES_CBC(data, key, iv):
    # Returns iv || ciphertext: the IV is the first block of the output.
    # len(key) doubles as the block size, which is only correct for 16-byte keys.
    blocks = reshape(iv, len(key))
    for block in reshape(data, len(key)):
        # XOR with the previous ciphertext block (or the IV), then encrypt.
        blocks.append(encryptAES_ECB(xorBytes(block, blocks[-1]), key))
    return flatten(blocks)


def decryptAES_CBC(data, key):
    # Expects iv || ciphertext, as produced by encryptAES_CBC.
    # Returns the plaintext with its padding still attached.
    shapedCt = reshape(data, len(key))
    shapedPt = []
    for blockIndex in range(len(shapedCt) - 1):
        decryptedBlock = decryptAES_ECB(shapedCt[blockIndex + 1], key)
        shapedPt.append(xorBytes(decryptedBlock, shapedCt[blockIndex]))
    return flatten(shapedPt)

#testing
ciphertext = encryptAES_CBC(testString, testKey, testIV)
plaintext = decryptAES_CBC(ciphertext,testKey)
assert plaintext == testString

In [None]:
#@title ex-1 data + key
testStrings = [
'MDAwMDAwTm93IHRoYXQgdGhlIHBhcnR5IGlzIGp1bXBpbmc=',
'MDAwMDAxV2l0aCB0aGUgYmFzcyBraWNrZWQgaW4gYW5kIHRoZSBWZWdhJ3MgYXJlIHB1bXBpbic=',
'MDAwMDAyUXVpY2sgdG8gdGhlIHBvaW50LCB0byB0aGUgcG9pbnQsIG5vIGZha2luZw==',
'MDAwMDAzQ29va2luZyBNQydzIGxpa2UgYSBwb3VuZCBvZiBiYWNvbg==',
'MDAwMDA0QnVybmluZyAnZW0sIGlmIHlvdSBhaW4ndCBxdWljayBhbmQgbmltYmxl',
'MDAwMDA1SSBnbyBjcmF6eSB3aGVuIEkgaGVhciBhIGN5bWJhbA==',
'MDAwMDA2QW5kIGEgaGlnaCBoYXQgd2l0aCBhIHNvdXBlZCB1cCB0ZW1wbw==',
'MDAwMDA3SSdtIG9uIGEgcm9sbCwgaXQncyB0aW1lIHRvIGdvIHNvbG8=',
'MDAwMDA4b2xsaW4nIGluIG15IGZpdmUgcG9pbnQgb2g=',
'MDAwMDA5aXRoIG15IHJhZy10b3AgZG93biBzbyBteSBoYWlyIGNhbiBibG93',
]
ex1Key = b'\xf7Et\x82\x10\xc7\x01.*_\xe1\x1eY\xff$\xc5'

In [None]:
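def encryptRandomlyChosenString():
    # Note: the chosen string is encrypted in its base64 form; it is not decoded first.
    string = random.choice(testStrings)
    pt = pkcs7Padding(bytes(string, 'ascii'), 16)
    iv = randomBytes(16)
    key = ex1Key
    # encryptAES_CBC returns iv || ciphertext, so the first element already starts with the IV.
    return (encryptAES_CBC(pt, key, iv), iv)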

def decryptAndCheckPadding(data):
    recoveredPT = decryptAES_CBC(data, ex1Key)
    try:
        removePadding(recoveredPT)
    except ValueError:
        return False
    return True

ct, iv = encryptRandomlyChosenString()
assert decryptAndCheckPadding(ct)

In [None]:
# Any byte string ending with the byte 0x01 is valid PKCS#7 padding
removePadding(randomBytes(5) + bytes([1]))

b'\xb3\x0e\x05\xc6$'

In [None]:
ct = bytearray(encryptRandomlyChosenString()[0])