**[✏️ replace this text with your name(s) and your ID number(s)]**

# Homework 3

*Due date:* March 6, 2025 (Thursday) at **11:59 PM** (note the non-standard time!)

The premise behind [confessions pages](https://en.wikipedia.org/wiki/Confessions_page) is pretty simple.
Students can anonymously submit their confessions and secrets, and then the admins decide which confessions to post on the page for everyone to view.
While anonymity has empowered many people to reveal their thoughts online, it can also make confessions pages very, *very* toxic!

After yet another Freedom Wall shut down just recently, some CS students decided that they would make a new and better Freedom Wall.
A very ambitious goal if I do say so myself.
It is currently hosted at http://hw3.lunchtimeattack.wtf.

Now's the time to bring out the big guns. In this homework, you will carry out one of the best-known attacks in modern cryptography, the *padding oracle attack*.
More specifically, your task is to use the Freedom Wall site as a padding oracle to decrypt the messages.

This homework has 30 points in total. Final percentages are capped at 100%.

Please be guided by the policies regarding late submissions, regrading, and collaboration.
Please direct any questions and clarifications about this homework to the `#hw3-help` channel on the Discord server.

## Some reminders

You are not allowed to use additional third-party libraries other than those explicitly used here, though libraries within the Python standard library are fair game.

**Very important:** Always work with raw bytes, never with encoded strings.
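
As a quick illustration of the difference, here is a minimal sketch using only the standard library. The variable names are illustrative and not part of the homework's starter code.

```python
from binascii import unhexlify

hex_text = "7f58c8d0"                    # hex-encoded string: 8 characters
raw = bytes.fromhex(hex_text)            # raw bytes: 4 bytes

assert raw == unhexlify(b"7f58c8d0")     # both decoders give the same bytes
assert len(hex_text) == 8 and len(raw) == 4

# Indexing and XOR should happen on the raw bytes, not on the hex text.
assert raw[0] == 0x7f

# Convert back to hex only at the boundary, e.g. when building a request.
assert raw.hex() == hex_text
```

The idea is to decode once on the way in, do all the arithmetic on `bytes` or `bytearray`, and encode once on the way out.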

## Some background

A *padding oracle* is an oracle (think of it like a black box or an API) that can tell whether or not the padding in a CBC-encrypted ciphertext is valid. 
In practice, a padding oracle can be found in a service on a remote host sending error messages whenever it receives malformed ciphertexts.

In a client-server scenario where the server is supposed to decrypt the message but keep the message content secret, a padding oracle vulnerability exists if the server indicates to the client that the padding was invalid. In this case, it is possible for an attacker to interactively query the server with manipulated copies of the ciphertext until the padding error does not occur, allowing them to determine, byte-by-byte, what the plaintext contains **without knowing the key!** In effect, what we're doing is a *side-channel attack*, since this attack targets the implementation of a computer system (the invalid padding error messages), rather than weaknesses in the implemented algorithm itself (AES-CBC).

Padding oracle attacks keep track of which inputs have valid padding and which don't, and exploit this information to decrypt chosen ciphertexts, which makes them an example of a *chosen-ciphertext attack* (CCA).
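
To make "valid padding" concrete, here is a minimal sketch of the kind of check a server might run after decrypting. It assumes PKCS#7-style padding with 16-byte blocks, and the function name `has_valid_padding` is hypothetical. The real server's check may differ in its details.

```python
def has_valid_padding(plaintext: bytes, block_size: int = 16) -> bool:
    """Return True if plaintext ends in well-formed PKCS#7-style padding."""
    if len(plaintext) == 0 or len(plaintext) % block_size != 0:
        return False
    n = plaintext[-1]                     # the last byte claims the padding length
    if n < 1 or n > block_size:
        return False
    # All of the last n bytes must be equal to n.
    return plaintext[-n:] == bytes([n]) * n

print(has_valid_padding(b"A" * 15 + b"\x01"))      # True
print(has_valid_padding(b"A" * 14 + b"\x02\x02"))  # True
print(has_valid_padding(b"A" * 14 + b"\x03\x02"))  # False
```

A server that reports the result of a check like this to the client, even just as a distinct error message, is acting as a padding oracle.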

The padding oracle attack was first described in a 2002 paper by Serge Vaudenay: [*Security Flaws Induced by CBC Padding: Applications to SSL, IPSEC, WTLS...*](https://www.iacr.org/archive/eurocrypt2002/23320530/cbc02_e02d.pdf)

## Getting started

First, go to the Freedom Wall site at http://hw3.lunchtimeattack.wtf/. It should look like this:

![fwsite](https://i.imgur.com/9ICCYUt.png)

The site has a form that allows users to test whether messages can be properly decrypted. The form uses JavaScript to make an HTTP GET request to an API endpoint `/api/verify`. You can verify this by opening Developer Tools (press F12) and making the request.

For example, if I enter `1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef` into the input box and then click "Verify", it makes a GET request to this URL: `http://hw3.lunchtimeattack.wtf/api/verify?message=1234567890abcdef1234567890abcdef1234567890abcdef1234567890abcdef`.
In response, it returns a JSON object containing a boolean value indicating whether the message is valid and, if it is not, a string giving the reason.

![devtools](https://i.imgur.com/U04lVtY.png)
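
The request URL described above can also be assembled in code. This is a minimal sketch using only the standard library; `BASE_URL` and `build_verify_url` are hypothetical names introduced here for illustration, and nothing is sent over the network.

```python
from urllib.parse import urlencode

BASE_URL = "http://hw3.lunchtimeattack.wtf/api/verify"

def build_verify_url(hexstring: str) -> str:
    """Hypothetical helper: build the GET URL that the form would request."""
    return BASE_URL + "?" + urlencode({"message": hexstring})

url = build_verify_url("1234567890abcdef" * 4)
print(url)
```

When using `requests`, passing `params={'message': hexstring}` to `requests.get` should produce the same query string without manual concatenation.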

For the morbidly curious, you can see the Python server code for this website here: https://gist.github.com/alltootechnical/6954863e92c49d780d431b633bc6d0ec

## 3-1. Interacting with the `verify` API [4 pts]

To efficiently carry out this attack, you don't want to manually copy and paste hex-encoded bytes!
Since we have access to the `/api/verify` endpoint, why not use that instead?

We will use the `requests` library to make GET requests.

In [1]:
import requests

In [2]:
req = requests.get('http://hw3.lunchtimeattack.wtf/api/verify?message=b75af2483be584bdba53b47694f3bf9a36d4ebe30a22174a57a526925babcd31')
req

<Response [200]>

To get the actual JSON response, we use `req.json()`. This outputs a dictionary, for which we can access the value of `key1` using `response['key1']`.

In [3]:
response = req.json()
response

{'reason': 'Invalid padding', 'result': False}

In [4]:
print(f"Is it valid? {response['result']}")
print(f"Why? {response['reason']}")

Is it valid? False
Why? Invalid padding


Now write a function called `padding_oracle` that, given a hex-encoded string, makes a GET request to the `verify` API and returns the JSON response as a dictionary.

In [None]:
def padding_oracle(hexstring):
    # TO-DO
    raise NotImplementedError

In [None]:
# if you did everything correctly, there should be no errors when running this cell
assert padding_oracle('148ba22d8923171fd4b70422acc45e06a6aee1e2882e2e9c636b4b07005e6e9f') == {'reason': 'Invalid padding', 'result': False}
assert padding_oracle('148ba22d8923171fd4b70422acc45e06a6aee1e2882e2e9c636b4b07005e6e9e') == {'result': True}

## 3-2. Decrypting the last byte [6 pts]

Let's start simple. For this part and the next two, we'll practice on this ciphertext:
```
7f58c8d06944c9504541f70bcc34bed37f7a2156c16ea20cb2b9b9822bf9d301e89eb96c26fe64bcc8526d43bd1edbbfe12a54f7ec4314d7052bf313807dacd0
```

This is also encrypted with the same key as the encrypted messages on the website.
Our goal for this part is to recover the last byte of the first plaintext block.

In [None]:
from binascii import unhexlify

test_ctxt = unhexlify(b'7f58c8d06944c9504541f70bcc34bed37f7a2156c16ea20cb2b9b9822bf9d301e89eb96c26fe64bcc8526d43bd1edbbfe12a54f7ec4314d7052bf313807dacd0')

A key property about CBC mode is that it is [*malleable*](https://en.wikipedia.org/wiki/Malleability_(cryptography)), meaning it's possible to transform a ciphertext into another ciphertext which decrypts to a related plaintext.
To be more precise, CBC mode is partly malleable since flipping a bit in a ciphertext block will completely mangle the plaintext it decrypts to, but will result in the same bit being flipped in the plaintext of the next block.
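
The bit-flipping property can be checked with a few lines of Python. This is a minimal sketch: `x` is an arbitrary stand-in for `Dec_k(C2)` (no real block cipher is involved), and `xor_bytes` here is a local helper written for this example.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte sequences (local helper for this sketch)."""
    return bytes(p ^ q for p, q in zip(a, b))

x = bytes(range(0x40, 0x50))     # stand-in for Dec_k(C2); the value is arbitrary
c1 = bytes(range(16))            # previous ciphertext block
p2 = xor_bytes(x, c1)            # CBC decryption: P2 = Dec_k(C2) ^ C1

c1_mod = bytearray(c1)
c1_mod[15] ^= 0x04               # flip one bit in the last byte of C1
p2_mod = xor_bytes(x, c1_mod)

assert p2_mod[:15] == p2[:15]            # the other bytes are untouched
assert p2_mod[15] == p2[15] ^ 0x04       # the same bit is flipped in P2
```

The sketch leaves out the other half of the story: the plaintext block that the modified `C1` itself decrypts to would be mangled.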

Suppose we want to decrypt the ciphertext block `C2`, like the one shown below. We're looking for two values: `X`, which is the result of `Dec_k(C2)`, and `P2`, the plaintext block obtained after decrypting in CBC mode. If we send a two-block ciphertext `C1 || C2` to the oracle (where `C1` can be any block we like), decryption will only succeed if the plaintext it decrypts to ends with valid padding.

![cbcdec](https://i.imgur.com/Nle7ZtP.png)

So to recover `P2[15]`, we "force" the resulting plaintext to have valid padding (`01` in this case) by taking `C1` and changing its last byte until decryption succeeds.
We can do this with a for loop over the last byte `b` from `0x00` to `0xff` (255), and we'll call this modified block `C1'`:
```
C1' = C1[:15] || bytes([b])
```
Let `P2'` be the result of XORing the decryption of `C2` with the modified ciphertext block `C1'`. Its last byte is
```
P2'[15] = X[15] ^ C1'[15]
```
If the decryption of `C1' || C2` succeeds, then we know that `P2'[15] = X[15] ^ C1'[15] = 0x01`. Now what does that tell us?

Since `C2` is the result of `Enc_k(P2 ^ C1)`, we have:
```
P2'[15] = X[15] ^ C1'[15]
        = Dec_k(C2)[15] ^ C1'[15]
        = Dec_k(Enc_k(P2 ^ C1))[15] ^ C1'[15]
        = (P2 ^ C1)[15] ^ C1'[15]
        = P2[15] ^ C1[15] ^ C1'[15]
```
and we have shown that `P2'[15] = 0x01`, so:
```
P2[15] ^ C1[15] ^ C1'[15] = 0x01
```
which we can rearrange to get
```
P2[15] = C1[15] ^ C1'[15] ^ 0x01
```
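
To see the derivation in action without touching the network, here is a minimal sketch against a local toy oracle. Everything in it is an assumption made for illustration: `toy_oracle` stands in for the server, `X` plays the role of `Dec_k(C2)`, and no real AES is involved. In this toy setup exactly one value of `b` succeeds, because the second to last plaintext byte (`d`) can never form part of a longer valid padding.

```python
# Toy setup: these three values live on the "server" side.
P2 = b"if you can read "                       # secret plaintext block
C1 = bytes(range(16, 32))                      # arbitrary previous block
X = bytes(p ^ c for p, c in zip(P2, C1))       # plays the role of Dec_k(C2)

def toy_oracle(c1_mod):
    """Local stand-in for the server: does X ^ C1' end in valid padding?"""
    plain = bytes(x ^ c for x, c in zip(X, c1_mod))
    n = plain[-1]
    return 1 <= n <= 16 and plain[-n:] == bytes([n]) * n

# Attacker side: only C1 and the oracle's yes/no answers are used.
c1_mod = bytearray(C1)
recovered = None
for b in range(256):
    c1_mod[15] = b                             # C1' = C1[:15] || bytes([b])
    if toy_oracle(c1_mod):
        recovered = C1[15] ^ b ^ 0x01          # P2[15] = C1[15] ^ C1'[15] ^ 0x01
        break

assert recovered == ord(" ")
```

The attacker's loop never reads `P2` or `X` directly. It only learns one bit per query, and that turns out to be enough.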

Using all of these steps, write a function called `decrypt_last_byte` that returns the last byte of the first plaintext block. Some pointers:
* Use your `padding_oracle` function to check whether the decryption is successful.
* Use the `split_blocks` function provided below to split your ciphertext input into blocks of 16 bytes.
* Use a `bytearray` object instead of `bytes` so that you can modify individual bytes.
* Use the `.hex()` method to convert raw bytes into a hex-encoded string for the `padding_oracle` function.

**⚠️ Important:** Make sure you get this step right before proceeding to the next steps.

In [None]:
def split_blocks(data, blksz=16):
    blocks = []
    for i in range(len(data) // blksz):
        blocks += [data[i*blksz:(i+1)*blksz]]
    return blocks

In [None]:
def decrypt_last_byte(ctxt):
    blocks = split_blocks(ctxt)
    # TO-DO

The last byte should decrypt to the space character `b' '` or 32 in decimal.

In [None]:
# if you did everything correctly, there should be no errors when running this cell
assert decrypt_last_byte(test_ctxt) == ord(' ')

## 3-3. Decrypting another byte [4 pts]

For the next byte, we have to choose a byte for `C1'[14]`. At this point we already know what `P2[15]` is.
Recall the equation we use to get `P2[15]`:
```
P2[15] = C1[15] ^ C1'[15] ^ 0x01
```
We can "solve" for `C1'[15]` by rearranging it and replacing the target padding byte `0x01` with `0x02`:
```
C1'[15] = P2[15] ^ C1[15] ^ 0x02
```
The padding byte is now `0x02` because we have moved from the last byte to the second to last, so we want to "force" the resulting plaintext to end in `0202`.

We take `C1` and change its second to last byte `C1[14]` until decryption succeeds.
We can do this with a for loop over the second to last byte `b` from `0x00` to `0xff` (255), and we'll again call the modified block `C1'`:
```
C1' = C1[:14] || bytes([b]) || bytes([P2[15] ^ C1[15] ^ 0x02])
                       ^^^           ^^^^^^^^^^^^^^^^^^^^^^^^  
                     C1'[14]                  C1'[15]
```
If the decryption of `C1' || C2` succeeds, then we know that `P2'[14] = X[14] ^ C1'[14] = P2[14] ^ C1[14] ^ C1'[14] = 0x02`. We have:
```
P2[14] ^ C1[14] ^ C1'[14] = 0x02
```
which we can rearrange to get
```
P2[14] = C1[14] ^ C1'[14] ^ 0x02
```
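
As a sanity check on the formula for `C1'[15]`, the arithmetic can be worked through with the values from the test ciphertext: the last byte of its first block (the IV) is `0xd3`, and the last plaintext byte is a space. The variable names below are illustrative only.

```python
c1_15 = 0xd3              # last byte of the first block (the IV) of the test ciphertext
p2_15 = ord(" ")          # last plaintext byte, known from part 3-2
x_15 = p2_15 ^ c1_15      # implied value of Dec_k(C2)[15]

# Choose C1'[15] so that the last plaintext byte is forced to 0x02.
c1_mod_15 = p2_15 ^ c1_15 ^ 0x02

assert x_15 ^ c1_mod_15 == 0x02
print(hex(c1_mod_15))     # 0xf1
```

With the last byte pinned to `0x02` this way, the loop over `C1'[14]` only has to find the one value that makes the second to last byte `0x02` as well.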

Take your entire `decrypt_last_byte` function from last time and modify it as follows:
* At the beginning (before the first for loop), provide a variable, say `last_two_bytes`, that will
  store the last two bytes it finds.
* Make a copy of the for loop (yes, copy–paste that thing), since the second for loop will deal with finding
  the second to last byte. 
* In the first for loop, instead of returning the found value `P2[15]`, append it to the `last_two_bytes` variable.
* In the second for loop, make the necessary changes according to the steps above. Like in the previous one, 
  instead of returning the found value `P2[14]`, append it to the `last_two_bytes` variable.
* Your function should then return `last_two_bytes`.
* Finally, rename your modified function as `decrypt_last_two_bytes`.

In [None]:
def decrypt_last_two_bytes(ctxt):
    # TO-DO
    raise NotImplementedError

You should now have the last two bytes of the first plaintext block as `b' d'`. This is actually reversed, but we'll fix that in the next step.

In [None]:
# if you did everything correctly, there should be no errors when running this cell
assert decrypt_last_two_bytes(test_ctxt) == b' d'

## 3-4. Decrypting a whole block [6 pts]

By now, you may (or may not) have noticed a pattern:
```
# decrypt last byte
C1'[15] = b
P2[15] = C1[15] ^ C1'[15] ^ 0x01

# decrypt 2nd to last byte
C1'[14] = b
C1'[15] = P2[15] ^ C1[15] ^ 0x02
P2[14] = C1[14] ^ C1'[14] ^ 0x02

# decrypt 3rd to last byte
C1'[13] = b
C1'[14] = P2[14] ^ C1[14] ^ 0x03
C1'[15] = P2[15] ^ C1[15] ^ 0x03
P2[13] = C1[13] ^ C1'[13] ^ 0x03

... and so on
```
So we can generalize it like this (in pseudocode):
```
P = bytearray([0, ..., 0])
              ^^^^^^^^^^^
               16 times
for each i from 0 to 15:    # for each byte position
    # calculate the padding
    pad_byte := i+1
    padding := bytearray([0, ..., 0] || [pad_byte, ..., pad_byte])
                         ^^^^^^^^^^^    ^^^^^^^^^^^^^^^^^^^^^^^^^
                         15-i times              i+1 times
    C1' := C1 ^ P ^ padding
    
    for each byte b from 0x00 to 0xff:
        # set the (15-i) byte to b
        C1'[15-i] := b
        # assemble the ciphertext to be tested
        modified := C1' || C2
        
        if the oracle says that decryption is successful:
            # compute the plaintext byte
            P[15-i] := b ^ C1[15-i] ^ pad_byte
            break
            
remove the padding characters (i.e., 0x00 to 0x10) from P
return P
```

Write a function called `decrypt_block` that does what's described in the pseudocode.
You already have your `xor_bytes` function from Homework 1 to XOR two bytestrings, so use that. Use `bytearray`s so that you can manipulate bytes.
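
The `padding` value and the line `C1' := C1 ^ P ^ padding` from the pseudocode could be built roughly as follows. This is a sketch under assumptions: `xor_bytes` below is a stand-in for the Homework 1 helper (its real signature may differ), and `padding_mask` is a hypothetical name. The example uses the IV block of the test ciphertext and pretends the last two plaintext bytes (`d` and a space) are already known.

```python
def xor_bytes(a, b):
    """Stand-in for the Homework 1 helper; the real signature is assumed."""
    return bytes(p ^ q for p, q in zip(a, b))

def padding_mask(i, blksz=16):
    """15-i zero bytes followed by i+1 copies of the padding byte i+1."""
    pad_byte = i + 1
    return bytes(blksz - pad_byte) + bytes([pad_byte]) * pad_byte

assert padding_mask(0) == bytes(15) + b"\x01"
assert padding_mask(2) == bytes(13) + b"\x03\x03\x03"

# i = 2: two bytes already recovered, now targeting byte 13.
c1 = bytes.fromhex("7f58c8d06944c9504541f70bcc34bed3")   # IV block of the test ciphertext
p = bytearray(16)
p[14:16] = b"d "                                         # known plaintext bytes so far
c1_mod = bytearray(xor_bytes(xor_bytes(c1, p), padding_mask(2)))

# Matches the pattern C1'[k] = P2[k] ^ C1[k] ^ 0x03 for the known positions.
assert c1_mod[14] == ord("d") ^ c1[14] ^ 0x03
assert c1_mod[15] == ord(" ") ^ c1[15] ^ 0x03
```

Byte 13 of `c1_mod` is then overwritten by each candidate `b` inside the inner loop, so its initial value does not matter.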

*Note:* Partial points may be given if you can only partially decrypt the ciphertext.

In [None]:
def decrypt_block(ctxt):
    c1, c2 = ctxt
    # TO-DO

Eventually, you'll find that the IV and first ciphertext block decrypt to the first plaintext block, which contains the string `b'if you can read '` (note the last character is a space).

In [None]:
# if you did everything correctly, there should be no errors when running this cell
assert decrypt_block(split_blocks(test_ctxt)[0:2]) == b'if you can read '

## 3-5. Decrypting multiple blocks [4 pts]

Now that we can decrypt a whole block, we can do a for loop and use `decrypt_block` as a subroutine feeding in two blocks at a time, like so:
```
blocks := split_blocks(ctxt)
plaintext := b''
for each block i from 0 to len(blocks)-2:   # ranges are inclusive
    decrypted_block := decrypt_block(blocks[i:i+2])
    append decrypted_block to plaintext
return plaintext
```
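
The inclusive range in the pseudocode translates to `range(len(blocks) - 1)` in Python. Here is a minimal sketch of just the pairing logic, using placeholder data rather than a real ciphertext; `split_blocks` is rewritten as an equivalent one-liner so that the example is self-contained.

```python
def split_blocks(data, blksz=16):
    """Same behaviour as the provided helper, written as a comprehension."""
    return [data[i*blksz:(i+1)*blksz] for i in range(len(data) // blksz)]

ctxt = bytes(range(64))          # placeholder: an IV plus 3 ciphertext blocks
blocks = split_blocks(ctxt)

# Each pair is (previous block, block to decrypt); the first "previous" is the IV.
pairs = [blocks[i:i+2] for i in range(len(blocks) - 1)]

assert len(blocks) == 4
assert len(pairs) == 3
assert pairs[0] == [ctxt[0:16], ctxt[16:32]]
assert pairs[2] == [ctxt[32:48], ctxt[48:64]]
```

A 64-byte ciphertext therefore yields three calls to `decrypt_block`, one per plaintext block.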

Write a function called `decrypt_message` to do this.

**⚠️ Important:** The speed of decryption may depend on your Internet connection and the server load, so expect some slowdowns when you make HTTP requests to the website when the server gets congested.
For reference, the test ciphertext takes about 20 minutes to decrypt.

In [None]:
def decrypt_message(ctxt):
    # TO-DO
    raise NotImplementedError

In [None]:
# if you did everything correctly, there should be no errors when running this cell
assert decrypt_message(test_ctxt) == b'if you can read this message, congrats!'

## 3-6. The main event [6 pts]

Finally, go to the Freedom Wall website and decrypt the three encrypted messages posted there using your `decrypt_message` function. Use `unhexlify` to convert hex-encoded bytes to raw bytes.

**⚠️ Important:** The speed of decryption may depend on your Internet connection and the server load, so expect some slowdowns when you make HTTP requests to the website when the server gets congested.
For reference, Posts #2 and #3 take about 25 to 30 minutes to decrypt, while Post #1 takes about an hour and a half.

*Note:* Partial points may be given if you can only partially decrypt the messages. I highly recommend printing the result after each block is decrypted.