Block cipher algorithms (like AES) require input data to have a length that is multiple of the block size. In the case of AES, which is the focus of this write-up, 16 bytes. If the plaintext doesn't follow this requirement, padding must be added to adjust its length.
This attack relies on having a "padding oracle" who answers if a message was correctly padded or not. This information can be obtained directly, or through side-channel attacks.
With that information, a ciphertext can be decrypted without ever knowing the key, except for the first block (the first 16 bytes), which requires the Initialization Vector (IV) used during encryption.
Kudos to flashs101 for inspiring this write-up is. Much of what follows is thanks to his work
Symmetric-key algorithms are cryptographic algorithms that use the same secret key for encryption and decryption operations. In contrast, asymmetric-key (or public-key) algorithms use two keys: a public (known) key to encrypt the messages and a private key to decrypt them.
The requirement that both parties have access to the secret key is one of the main drawbacks of symmetric-key encryption. That's why public-key algorithms are often used to exchange this secret.
Symmetric-key encryption can use either stream or block ciphers:
- Stream ciphers. Encrypts data as a continuous stream of bits or bytes, one at a time, using a pseudorandom keystream (a sequence of bytes or bits generated by the algorithm using a secret key).
- Block ciphers. Encrypts data in fixed-size blocks, padding the plaintext to achieve a multiple of the block size.
We'll focus on the latter.
The following are some of the basic concepts required to understand this write-up:
- Plaintext: Any readable information presented in a format that is accessible and usable without any decryption, including binary data.
- Ciphertext: The encrypted version of the plaintext.
- Cipher: A cryptographic algorithm used to apply encryption or decryption operations to data.
- Encryption: Turns plaintext into ciphertext by using a cipher and a secret key.
- Decryption: Turns ciphertext into plaintext.
First, install the dependencies. You can do that in a venv:
python3 -m venv .venv
source $(pwd)/.venv/bin/activate # Activates the venv
python3 -r src/requirements.txtOnce installed, you can run the script:
python3 src/main.pyAvailable params:
-p PLAINTEXT,--plaintext PLAINTEXT. Plaintext to encrypt using AES-CBC mode (base64-encoded). If not provided, sample text is used.-c CIPHERTEXT,--ciphertext CIPHERTEXT. AES-CBC ciphertext to decrypt (base64-encoded). If not provided, sample text is used.-i IV,--iv IV. Vector for decrypting the first block (base64-encoded).--use-native-padder. Usecryptography's implementation of the PKCS#7 padder.
If neither the plaintext or ciphertext are provided, the following plaintext will be used:
The truth may be stretched thin, but it never breaks, and it always surfaces above lies, as oils floats on water.
Note that all input strings must be Base64 encoded. As an example, this is how you'd encode the IV:
import base64
base64.b64encode(b'\x12\x87\xff\xaf\xe9kF\xd0\x89\xe0T\x9a\xfb5\xa5{').decode()
# Output: 'Eof/r+lrRtCJ4FSa+zWlew=='To use it:
python3 src/main.py -i 'Eof/r+lrRtCJ4FSa+zWlew=='The repo is structured as follows:
src/main.py: Contains the attack logic.oracle.py: Simulates the server's oracle.pkcs7.py: My own implementation of the PKCS7 padding algorithm.settings.py: Configuration values.requirements.txt
The following values are set in src/settings.py.
BLOCK_SIZE: AES block size in bytes.SAMPLE_TEXT: Default plaintext used when no plaintext or ciphertext is given as input.KEY: Default encryption key.IV: Default IV.
Change them as you see fit.
A block cipher by itself can only securely transform fixed-size blocks of data. To encrypt messages of arbitrary length, we need to use a mode of operation, which defines how to process multiple blocks and handle padding.
Most modes of operations use an Initialization Vector (IV), a unique binary sequence added to the encryption process. The IV adds randomness, ensuring that encrypting the same plaintext with the same key will produce different ciphertexts. This is important to avoid attackers from spotting patterns, making it easier to guess the original message, or find weaknesses in the encryption.
There are many modes of operations, but the most common (for real) are these:
- GCM (Galois/Counter Mode)
- CCM (Counter with CBC-MAC)
- SIV (Synthetic IV)
- GCM-SIV
- CBC (Cipher Block Chaining)
Modes like ECB (Electronic Codebook) are avoided because they are vulnerable to pattern analysis and other attacks. Check this section on the Block cipher's wikipedia page to see how insecure it is.
The padding oracle attack is primarily based on the AES-CBC mode. By observing how a system responds to different ciphertext (the padding is valid or not) the attacker (us) will be able to gradually recover the plaintext without knowing the secret key.
Let's see how this mode of operation works.
During encryption, each plaintext block is XORED with the previous ciphertext block (or the IV for the first block), and then encrypted with the secret key.
Since each ciphertext block is dependent on the previous one, encryption cannot be parallelized.
To decrypt, we simply reverse the process: first, decrypt each ciphertext block using the secret key, then XOR the result with the previous ciphertext block (or the IV for the first block). The recovers the original plaintext block.
These operations can be represented as mathematical operations. Don't worry, they are fairly simple, and are only included here because they are needed for this attack. If not, I wouldn't bother.
For encryption:
where:
-
$C_i$ is the ciphertext block -
$E_K$ is the AES-CBC encryption operation -
$P_i$ is the plaintext block -
$C_{i-1}$ is the previous ciphertext block, or$IV$ for the first block -
$\oplus$ means XOR
And the decryption formula is:
We'll come to them in a second.
One very important thing to note: XOR works on individual bits (and by extension, bytes). This means that if you change one bit (or byte), only that specific part is affect, nothing else changes.
Here's where the magic resides.
Many AES-CBC implementations use the PKCS#7 algorithm for padding, a simple yet elegant solution to this problem.
This algorithm works in a very simple way. If N is the numbers of bytes in a block and M bytes (M < N) are missing from the last block, it adds the character 0xM (hexadecimal) M times at the end of the block.
To simplify things, let's say we are working with blocks of 4 bytes, and we want to encrypt the following plaintext:
p = 'PURPLE'
Since our plaintext is 6 bytes long and the block size is 4, PKCS#7 needs to add 2 extra bytes to fill up the last block, like so:
p_padded = 'PURPLE\0x02\0x02'
If 3 extra bytes were needed it would add three 0x03, for 4, 4 of 0x04, ...
One important thing to note is that PKCS#7 always add padding, even if the plaintext's length is already a multiple of the block size.
Our plaintext now looks like this:
p = 'HOLA'
The plaintext is 4 bytes long, which is a multiple of 4, but because it always adds padding, a new block of 0x04 characters will be appended at the end:
p_padded = 'HOLA\0x04\0x04\0x04\0x04'
This quote from the simpsons illustrates WAY to well how the whole attack works. You'll get the why in a moment.
Now that we have the theoretical base in place, let's see how we can exploit the Oracle for our own benefit. To make things simpler, we are going to work with a small example using just 4 blocks. Meaning, we'll have 4 plaintext blocks and 4 ciphertext blocks. The block size will still be 16, we are on AES after all.
Let's illustrate this. For encryption we have:
And decryption:
Let's take another look at the (AES-CBC) decryption formula:
Now, let's take X. This is a block that we control and can change as we please. This is the base for the whole attack.
Expressing this with the previous formulas, we get:
$C_3 = E_K(P_3 \oplus C_2)$ $P^{\prime}_1 = D_K(C_3) \oplus X$
If we replace
Because decryption (
And we've gotten to a point where we have 2 knowns and 2 unknowns.
Knowns:
-
$C_2$ : Previous ciphertext block. -
$X$ : Ciphertext block we control.
Unknowns:
-
$P_3$ : Last plaintext block. -
$P^{\prime}_1$ : Last plaintext block from decrypting$X || C_3$ ($||$ means concatenation).
XOR works on individual bits (and by extension, bytes). This means that if you change one bit (or byte), only that specific part is affected. And this is a very important concept for us.
Let's suppose we have the following sequence of numbers:
We want to XOR this sequence so that only the number 27 is affected. If we apply a XOR between the binary
We can go a step further and apply the
Because XOR is applied on individual bytes (or bits), we can break down the previous
$P_{3[0]} = P^{\prime}_{1[0]} \oplus C_{2[0]} \oplus X{[0]}$ $P_{3[1]} = P^{\prime}_{1[1]} \oplus C_{2[1]} \oplus X{[1]}$ $...$ $P_{3[15]} = P^{\prime}_{1[15]} \oplus C_{2[15]} \oplus X{[15]}$
We go up to 15 since the block size is 16 and we start counting at 0.
To get the last plaintext byte we need to resolve this formula (the one we just saw):
If we recall the explanation we saw earlier, we have two knowns and two unknowns, so let's focus on the one unknown we can get:
The idea for the first iteration is to change the last byte of
This is really interesting: if we can force the last plaintext block to have a specific padding (0x01 for the first iteration), we can get these last plaintext bytes by just applying the
We can literally force the server to reveal the plaintext WITHOUT EVER KNOWING THE KEY!
Math baby, MATH.
For this first iteration we want to force the plaintext to have 0x01 padding (
Because XOR works on bytes, we can simple try every possible value (0 to 255) for
To make sure we're not accidentally hitting any existing padding, we also need to change the previous byte of 0x01.
Once we know which value makes
where:
-
$P^{\prime}_{1[15]}$ :0x01(the value we forced by changing$X_{[15]}$ ). -
$C_{2[15]}$ : The corresponding byte from the previous ciphertext block. -
$X_{[15]}$ : The value (0 to 255) that made$P^{\prime}_{1[15]}$ equal to0x01
There are two formulas involved now:
$P_{3[15]} = P^{\prime}_{1[15]} \oplus C_{2[15]} \oplus X{[15]}$ $P_{3[14]} = P^{\prime}_{1[14]} \oplus C_{2[14]} \oplus X{[14]}$
We want to force
However, we can't use the 0x01 padding, not 0x02! But we do have
where:
-
$P^{\prime}_{1[15]}$ : Is0x02 -
$C_{2[15]}$ : The corresponding byte from the previous ciphertext block. -
$P_{3[15]}$ : Is the plaintext byte we calculated in the previous section.
With that in mind, we need to:
- Calculate and update
$X{[15]}$ based on the previous formula. - Scramble
$X{[13]}$ (the byte before$X{[14]}$ ) to make sure the only possible padding is0x02. - Try every possible value (0 to 255) for
$X{[14]}$ until the oracle says the padding is valid.
To calculate any given plaintext byte (
For each position (byte), we calculate the right value for
Since plaintext bytes are calculated per-block, this process can be parallelized.
The following pseudocode illustrates this process:
For each block (except the first one):
For j from 15 down to 0:
- Set the padding value to (16 - j): 0x01, 0x02, ...
- For all positions k>j, set X[k] so that P'[k] = padding (using formula1)
- Try all possible values for X[j] (0 to 255) until the oracle says the padding is valid
- Once found, calculate the plaintext byte using formula2
where:
- Formula1:
$X{[k]} = padding \oplus C_{i-1[k]} \oplus P_{i[k]}$ - Formula2:
$P_{i[j]} = padding \oplus C_{i-1[j]} \oplus X{[j]}$









