# A write-up for Google CTF 2020
-   Challenge: **ORACLE**
-   Category: *Cryptography*
-   By: [ndh](https://www.github.com/nguyenduyhieukma)

## 1. Challenge summary

We are asked to perform chosen plaintext/ciphertext attacks on the [AEGIS](https://eprint.iacr.org/2013/695.pdf) cipher with key-IV pair reuse. To make the challenge harder, the number of oracle calls (i.e. the number of plaintexts/ciphertexts) is limited.

## 2. AEGIS overview

In this section, I give an overview of how AEGIS works. Only the details needed to solve the challenge are presented. For the full specification, please refer to this [link](https://eprint.iacr.org/2013/695.pdf). Also note that _"state transition function"_ in this section is equivalent to _"state update function"_ in the reference paper.

### 2.1 The overall picture

AEGIS is an AEAD (authenticated encryption with associated data) cipher. The whole encryption process can be split into 4 phases: _initialization, authentication of associated data, plaintext encryption_ and _finalization_, as shown in [Figure 1](#fig-1). Inputs to the process are a key, an IV, a plaintext and some associated data that needs to be authenticated (but not encrypted). Outputs are the plaintext in encrypted form (ciphertext) and an authentication tag.

<a name="fig-1"></a>![AEGIS encryption](images/aegis.png)  
_Figure 1: AEGIS encryption_.

#### Initialization

The initial state of an AEGIS cipher is derived from the key, the IV and some predefined constants. This state then transits `t` times (hence the initial state is denoted as `s[-t]`) using `ip[i]` (`i = 0,1,...,t-1`), which are also derived from the key and the IV, as state transition parameters. The purpose of this phase is to make sure that the internal state after initialization, `s[0]`, looks like a random value even if partial information about the key and the IV is known to attackers. Therefore, `t` is chosen depending on how fast a change in the key or IV propagates inside the state with each transition.

#### Authentication of associated data

Associated data is cut into blocks `a[0]`, `a[1]`, ... and the internal state continues to transit `u` times more (`u` is the length of the associated data in blocks), using those blocks as state transition parameters.

#### Plaintext encryption

Just like in the previous phase, plaintext data is cut into blocks `m[0]`, `m[1]`, ... and these blocks are used as parameters to make the internal state transit `v` more times (`v` is the length of the plaintext in blocks). Besides that, during this phase, a key block generation function `kgf` is applied to the internal states `s[u+i]`, `i = 0,1,...,v-1`, to form a key stream `k[0]`, `k[1]`, ..., which is then XORed with the plaintext to produce the ciphertext `c[0]`, `c[1]`, ....

#### Finalization

The internal state continues to transit for a number of times, default to 6, to the final state `s[u+v+6]`, parameterized by a single value `fp`, which is derived from `s[u+v]` (the state at the beginning of this phase) and the lengths of the plaintext and associated data. A tag generation function `tgf` is then applied to the final state to obtain the authentication tag `at`. The whole encryption process is now completed.

Although the AEGIS decryption process is not described here, it is mostly the same as the encryption process, except that the ciphertext blocks `c[0]`, `c[1]`, ... are XORed with the key stream first, and the resulting plaintext blocks are then used as state transition parameters (in the encryption process, the plaintext blocks are used directly). There is also an additional step that compares the two authentication tags (the computed one versus the one that comes with the ciphertext).
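
The symmetry between the two directions can be sketched as follows. This is a minimal sketch, not real AEGIS: `toy_stf` and `toy_kgf` are hypothetical hash-based stand-ins for the functions described in the next sections, the state is a single 16-byte word, and initialization, associated data and finalization are left out.

```python
import hashlib


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def toy_stf(state, param):
    # Hypothetical stand-in for the state transition function.
    return hashlib.sha256(b"stf" + state + param).digest()[:16]


def toy_kgf(state):
    # Hypothetical stand-in for the key block generation function.
    return hashlib.sha256(b"kgf" + state).digest()[:16]


def encrypt_blocks(state, plaintext_blocks):
    ciphertext_blocks = []
    for m in plaintext_blocks:
        ciphertext_blocks.append(xor(m, toy_kgf(state)))  # c[i] = m[i] ^ k[i]
        state = toy_stf(state, m)  # the plaintext block drives the transition
    return state, ciphertext_blocks


def decrypt_blocks(state, ciphertext_blocks):
    plaintext_blocks = []
    for c in ciphertext_blocks:
        m = xor(c, toy_kgf(state))  # recover m[i] first ...
        plaintext_blocks.append(m)
        state = toy_stf(state, m)  # ... then use it as the parameter
    return state, plaintext_blocks


s0 = bytes(16)
blocks = [b"A" * 16, b"B" * 16, b"C" * 16]
end_enc, ct_blocks = encrypt_blocks(s0, blocks)
end_dec, pt_blocks = decrypt_blocks(s0, ct_blocks)
print(pt_blocks == blocks, end_enc == end_dec)
```

Both comparisons print `True`: starting from the same state, decryption walks through exactly the same sequence of states as encryption, which is why both oracles in the challenge expose the same internal state.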

Now, let's delve into the state transition function `stf`.

### 2.2 The state transition function

Each AEGIS state `s[i]` consists of `n` 16-byte words: `s[i][0]`, `s[i][1]`, ..., `s[i][n-1]` ([Figure 2](#fig-2)).

<a name="fig-2"></a>![](images/state.png)  
_Figure 2: AEGIS internal state._

The state transition function `stf` makes use of the AES round function `AESRound` that takes as input two 16-byte words to produce another 16-byte word: `z = AESRound(x, y) = R(x) ^ y`, in which `R` is `AESRound` without the _AddRoundKey_ step. When the state transition parameter is omitted (or equal to the all-zeroes word), the AEGIS state transition function can be simply described as follows:
-   `s[i+1][0] = R(s[i][n-1]) ^ s[i][0]`
-   `s[i+1][1] = R(s[i][0])   ^ s[i][1]`
-   `s[i+1][2] = R(s[i][1])   ^ s[i][2]`
-   ...

Or more intuitively in [Figure 3](#fig-3):

<a name="fig-3"></a>![](images/stf-no-param.png)  
_Figure 3: AEGIS state transition function without parameter._

The state transition parameter `p[i]` consists of `d` 16-byte words: `p[i][0]`, `p[i][1]`, ..., `p[i][d-1]`. These words are XORed into words of the internal state right before the state transition function completes. Usually, `n` (the number of words in a state) is divisible by `d`, and the parameter words are "evenly distributed" over the state words: `s[i+1][j * n/d] ^= p[i][j]`, `j = 0,1,...,d-1`. This way, a change in the parameter propagates through the state words as quickly as possible.
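
The wiring described above can be sketched in a few lines. This is an illustration under assumptions: `stand_in_R` is a hash-based placeholder rather than the AES round (the wiring does not depend on what `R` computes), and `stf` and `changed_words` are hypothetical helper names.

```python
import hashlib


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def stand_in_R(word):
    # NOT the AES round: any fixed 16-byte function is enough to show the wiring.
    return hashlib.sha256(word).digest()[:16]


def stf(state, param, R=stand_in_R):
    """One transition of an n-word state with a d-word parameter."""
    n, d = len(state), len(param)
    assert n % d == 0
    # state[j - 1] wraps around to state[n - 1] when j == 0
    new_state = [xor(R(state[j - 1]), state[j]) for j in range(n)]
    for j in range(d):
        new_state[j * n // d] = xor(new_state[j * n // d], param[j])
    return new_state


def changed_words(n, d, flipped_param_word):
    """Indices of the next-state words affected by flipping one parameter word."""
    state = [bytes([j]) * 16 for j in range(n)]
    param = [bytes(16)] * d
    flipped = list(param)
    flipped[flipped_param_word] = b"\xff" * 16
    before, after = stf(state, param), stf(state, flipped)
    return [j for j in range(n) if before[j] != after[j]]


print(changed_words(5, 1, 0))  # AEGIS-128 shape: [0]
print(changed_words(8, 2, 0))  # AEGIS-128L shape, first parameter word: [0]
print(changed_words(8, 2, 1))  # AEGIS-128L shape, second parameter word: [4]
```

A parameter word therefore touches exactly one state word in the transition where it is absorbed; the spreading happens in the following transitions.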

Before closing this section, let's take a look at the state transition functions of the two concrete instantiations of AEGIS that are mentioned in the challenge, namely AEGIS-128 ([Figure 4](#fig-4)) and AEGIS-128L ([Figure 5](#fig-5)).

<a name="fig-4"></a>![](images/aegis-128.png)  
_Figure 4: AEGIS-128 (AEGIS with n=5, d=1) state transition function._

<a name="fig-5"></a>![](images/aegis-128l.png)  
_Figure 5: AEGIS-128L (AEGIS with n=8, d=2) state transition function._

As a result, AEGIS-128 has a state size of 80 bytes and a state transition parameter size of 16 bytes. For AEGIS-128L, the state size and parameter size are 128 bytes and 32 bytes, respectively.

### 2.3 The key block generation function

The key block generation function `kgf` is applied to a state to produce one key block. The key block size must be the same as the state transition parameter size (since key blocks are XORed with plaintext blocks, and plaintext blocks are used as state transition parameters), which is `16*d` bytes.

`kgf` should not involve the state words that get XORed with the parameter words. If those words are involved, the relation between the parameters and the key blocks becomes too simple, which violates the confusion principle for a secure cipher. For example, let `n=4` and `d=1`: if `kgf(s[i]) = s[i][0] ^ s[i][1] ^ s[i][2] ^ s[i][3]` (XOR all the state words together), bit flips in a parameter will cause bit flips at the same positions in the next generated key block. This property would certainly become a rich source of attacks.

The key block generation function for AEGIS-128 is `kgf(s[i]) = s[i][1] ^ (s[i][2] & s[i][3]) ^ s[i][4]` (`s[i][0]` is omitted). For AEGIS-128L, it is the concatenation of `k[i][0] = s[i][1] ^ (s[i][2] & s[i][3]) ^ s[i][6]` and `k[i][1] = s[i][2] ^ (s[i][6] & s[i][7]) ^ s[i][5]` (`s[i][0]` and `s[i][4]` are omitted).
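
The two key block generation functions, and the insecure XOR-everything variant from the example above, can be written down directly. This is a sketch with hypothetical function names; the comparison at the end uses a 5-word state instead of the 4-word one from the example, and models a parameter difference as a difference in word 0 only, which is where it lands.

```python
import os


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def and_(a, b):
    return bytes(x & y for x, y in zip(a, b))


def kgf_aegis128(s):
    # s is a list of 5 words; word 0 is not used
    return xor(xor(s[1], s[4]), and_(s[2], s[3]))


def kgf_aegis128l(s):
    # s is a list of 8 words; words 0 and 4 are not used
    k0 = xor(xor(s[1], s[6]), and_(s[2], s[3]))
    k1 = xor(xor(s[2], s[5]), and_(s[6], s[7]))
    return k0 + k1


def kgf_xor_all(s):
    # The insecure variant: XOR all the state words together.
    out = bytes(16)
    for word in s:
        out = xor(out, word)
    return out


state = [os.urandom(16) for _ in range(5)]
delta = b"\x01" * 16
state_flipped = [xor(state[0], delta)] + state[1:]  # only word 0 differs

print(xor(kgf_xor_all(state), kgf_xor_all(state_flipped)) == delta)
print(kgf_aegis128(state) == kgf_aegis128(state_flipped))
```

Both lines print `True`: the XOR-everything variant leaks the injected difference unchanged, while the AEGIS-128 key block does not react to it at all.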

### 2.4 The advantages and disadvantages

Although the information in this sub-section is not needed for solving the challenge, I think it is quite interesting and worth taking away after reading the write-up, so I dedicate a few words to it. The advantages and disadvantages listed below are geared towards the performance aspect of AEGIS, especially when compared to any AEAD cipher suite that involves AES (for example, AES128/CBC/HMAC/SHA256).

#### Advantages:
-   Generally fast on processors that support AES-related instructions.
-   To process a 16-byte plaintext/ciphertext block, AEGIS makes `n/d` calls to `AESRound`: 4 (AEGIS-128L), 5 (AEGIS-128) or 6 (AEGIS-256). For AES, this number is 10 (AES-128), 12 (AES-192) or 14 (AES-256). Additionally, the `AESRound` calls within one AEGIS state transition can be made in parallel.
-   The overhead for computing an authentication tag is fixed. For any AEAD cipher suite involving AES that I know of, this overhead scales linearly with the plaintext/ciphertext length.

#### Disadvantages:
-   Not so efficient on small input.
-   Encryption/decryption of plaintext/ciphertext blocks cannot be parallelized.

Now, let's see what an attacker can do when an AEGIS key-IV pair is reused.

## 3. Key-IV pair reuse attack on AEGIS

The AES round function is the basis on which AEGIS is built. Therefore, understanding this function is required to understand the attack.

### 3.1 Cryptanalysis of the AES Round function

Recall that the AES Round function `AESRound` operates on 16-byte words. It takes as input 2 words to produce another word, and consists of 4 steps: _SubBytes_, _ShiftRows_, _MixColumns_ and _AddRoundKey_. When the _AddRoundKey_ step is omitted, we have a function denoted by `R` that consists of the 3 remaining steps and the AES Round function can now be described as: `z = AESRound(x, y) = R(x) ^ y` ([Figure 6](#fig-6)).

<a name="fig-6"></a>![](images/aesround.png)  
_Figure 6: The AES Round function_

If we treat each word as a vector in a 16-dimensional vector space over F256 (the finite field of size 256), _AddRoundKey_ is simply vector addition, and _ShiftRows_ and _MixColumns_ are linear operators that multiply the input vector by suitable 16x16 matrices. The only subtlety we have to deal with lies in the _SubBytes_ step, which replaces each component of the input vector with another element of F256 based on a predefined substitution box.

What's so good (for us) about a linear operator `f` is that, if the difference between two inputs is `∆`, the difference between the two outputs will be `f(∆)` and vice versa. Applied to our situation, if we know the difference between two outputs of the `R` function (`∆x3` as in [Figure 6](#fig-6)), we can deduce the difference between the two temporary vectors (`x1` and `x1 + ∆x1`) right after the _SubBytes_ step: `∆x1 = InvShiftRows(InvMixColumns(∆x3))`.
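
The relation `∆x1 = InvShiftRows(InvMixColumns(∆x3))` can be checked with a small self-contained sketch of the linear part of the round. The helper names are hypothetical, and the 16-byte layout assumed here is the usual AES one (byte `r + 4*c` sits in row `r`, column `c`).

```python
import os


def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def shift_rows(w, inverse=False):
    # Row r is rotated by r columns (to the left, or to the right if inverse).
    sign = -1 if inverse else 1
    return bytes(w[r + 4 * ((c + sign * r) % 4)] for c in range(4) for r in range(4))


def mix_columns(w, coeffs=(2, 3, 1, 1)):
    # Each column is multiplied by the circulant matrix built from coeffs.
    out = bytearray(16)
    for c in range(4):
        col = w[4 * c: 4 * c + 4]
        for r in range(4):
            acc = 0
            for k in range(4):
                acc ^= gmul(coeffs[(k - r) % 4], col[k])
            out[4 * c + r] = acc
    return bytes(out)


def linear_layer(x1):
    # x1 -> x3 = MixColumns(ShiftRows(x1))
    return mix_columns(shift_rows(x1))


def inv_linear_layer(x3):
    # x3 -> x1 = InvShiftRows(InvMixColumns(x3))
    return shift_rows(mix_columns(x3, coeffs=(14, 11, 13, 9)), inverse=True)


x1, x1_other = os.urandom(16), os.urandom(16)
dx3 = xor_bytes(linear_layer(x1), linear_layer(x1_other))
print(inv_linear_layer(dx3) == xor_bytes(x1, x1_other))
```

The check prints `True` for any pair of inputs: the output difference alone determines the difference right after _SubBytes_, without knowing either value.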

Additionally, if we also know the difference between the two input vectors to `R` (`∆x` as in [Figure 6](#fig-6)), we can solve for `x` by brute-forcing each component `x[i]` until `SubByte(x[i]) ^ SubByte(x[i] ^ ∆x[i]) == ∆x1[i]`, `i = 0,1,...,15`. However, since this equation always has an even number of solutions (if `x[i] = a` is a solution, `x[i] = a ^ ∆x[i]` is another one), we get not a unique solution but a set of at least `2**16 = 65536` candidates for `x`.

So, just to summarize, if we know `∆x` and `∆R(x)`, we can narrow `x` down to a small set of candidates, and a second pair of differences is usually enough to single `x` out.
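
The per-byte search can be tried in isolation. In the sketch below, the S-box is rebuilt from its definition (inverse in GF(2^8) followed by the affine map) only to keep the block self-contained; `candidates` and `secret` are hypothetical names, and `0x3A` is an arbitrary example byte.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def sbox_entry(x):
    # Multiplicative inverse (0 maps to 0), found by brute force ...
    inv = 0 if x == 0 else next(y for y in range(1, 256) if gmul(x, y) == 1)
    # ... followed by the AES affine transformation.
    out = inv
    for shift in range(1, 5):
        out ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
    return out ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]


def candidates(dx, dx1):
    """All bytes c with SBOX[c] ^ SBOX[c ^ dx] == dx1."""
    return {c for c in range(256) if SBOX[c] ^ SBOX[c ^ dx] == dx1}


secret = 0x3A  # the unknown byte x[i]
first = candidates(0x01, SBOX[secret] ^ SBOX[secret ^ 0x01])
second = candidates(0x02, SBOX[secret] ^ SBOX[secret ^ 0x02])
print(sorted(first), sorted(second), sorted(first & second))
```

Each single-difference set contains `secret` together with its partner (`secret ^ 0x01` or `secret ^ 0x02`), so it always has an even size. Intersecting the two sets removes the partners, and the true byte is guaranteed to survive.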

### 3.2 Cryptanalysis of AEGIS with key-IV pair reuse

When a key-IV pair is reused, the internal state of an AEGIS cipher after initialization (`s[0]` as in [Figure 1](#fig-1)) stays the same. Further state transitions only depend on the supplied inputs, including the associated data and the plaintext/ciphertext, which are under control of the attacker in the traditional chosen plaintext/ciphertext attack model. Moreover, the attacker is able to observe the produced key stream by simply XOR-ing the plaintext and the ciphertext (in a plaintext-ciphertext pair) together.

The attacker's goal is to recover the internal state, which means that he can then encrypt/decrypt arbitrary plaintexts/ciphertexts of his choice without access to the oracle. Can the attacker do this? The answer is yes, and the attack generally works as follows (also, see [Figure 7](#fig-7)):
1.   Inject a difference into the state transition parameter word `p[0][0]` by flipping bits in the first 16 bytes of the plaintext/ciphertext sent to the oracle (assume that the associated data is null).
2.   Since `p[0][0]` changes, the state word `s[1][0]` changes and `∆s[1][0] = ∆p[0][0]`. However, the key block generation function does not involve `s[1][0]` (see [Section 2.3](#2.3-The-key-block-generation-function)). Therefore, `k[1]` remains unchanged.
3.   Since `s[1][0]` changes, both `s[2][0]` and `s[2][1]` change.
4.   Since `s[2][1]` changes, the key block `k[2]` changes. The difference `∆k[2]` is observed.
5.   Deduce `∆s[2][1]` from `∆k[2]`.
6.   Since `s[1][1]` does not change, `∆R(s[1][0]) = ∆s[2][1]`.
7.   Compute a set of possible values for `s[1][0]` from `∆s[1][0]` and `∆R(s[1][0])` (see [Section 3.1](#3.1-Cryptanalysis-of-the-AES-Round-function)).
8.   Repeat the above steps with another difference injected to `p[0][0]` until `s[1][0]` can be uniquely determined.

<a name="fig-7"></a>![](images/change-propagation.png)  
_Figure 7: Propagation of a difference in AEGIS state transition parameter._
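
Steps 2 to 6 can be checked locally for the AEGIS-128 shape (`n=5`, `d=1`). This is a sketch under assumptions: `stand_in_R` is a hash-based placeholder instead of the AES round (the propagation pattern is the same for any `R`), the state `s0` is random instead of coming from a real initialization, and the helper names are hypothetical.

```python
import hashlib
import os


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def and_(a, b):
    return bytes(x & y for x, y in zip(a, b))


def stand_in_R(word):
    # NOT the AES round; only used to show where differences travel.
    return hashlib.sha256(word).digest()[:16]


def stf128(s, p):
    new = [xor(stand_in_R(s[j - 1]), s[j]) for j in range(5)]
    new[0] = xor(new[0], p)
    return new


def kgf128(s):
    return xor(xor(s[1], s[4]), and_(s[2], s[3]))


def run(s0, p0):
    """Return s[1], k[1] and k[2] for parameters p[0] = p0 and p[1] = 0."""
    s1 = stf128(s0, p0)
    s2 = stf128(s1, bytes(16))
    return s1, kgf128(s1), kgf128(s2)


s0 = [os.urandom(16) for _ in range(5)]
delta = b"\x01" * 16
s1, k1, k2 = run(s0, bytes(16))
s1_d, k1_d, k2_d = run(s0, delta)

d_R = xor(stand_in_R(s1[0]), stand_in_R(s1_d[0]))  # ∆R(s[1][0])
print(xor(s1[0], s1_d[0]) == delta)  # step 2: ∆s[1][0] = ∆p[0]
print(k1 == k1_d)                    # step 2: k[1] is unchanged
print(xor(k2, k2_d) == d_R)          # steps 4-6: ∆k[2] = ∆s[2][1] = ∆R(s[1][0])
```

All three lines print `True`, so the observed `∆k[2]` is exactly the output difference of `R` that step 7 needs.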

Now, the attacker has successfully recovered `s[1][0]`. He can similarly recover `s[2][0]`, `s[3][0]`, ... and use the relations of the words defined by the state transition function to deduce the remaining unknown words of `s[1]`. For example, `s[1][n-1] = invR(s[2][0] ^ p[1][0] ^ s[1][0])`.

The attack is totally practical. Now, let's head back to our challenge!

## 4. Solving the challenge

The main challenge consists of two sub-challenges. The first one asks us to perform a chosen-plaintext attack on AEGIS-128L (see [Section 2.2](#2.2-The-state-transition-function) and [Section 2.3](#2.3-The-key-block-generation-function) if you want to look back at its state transition and key block generation functions) with key-IV pair reuse.

### 4.1 Sub-challenge 1: Chosen plaintext attack on AEGIS-128L with key-IV pair reuse

For each connection to the oracle, a randomly generated key-IV pair is used to initialize an AEGIS-128L cipher. This cipher is then used to encrypt up to 7 plaintexts of our choice. The goal is to recover the internal state `s[1]`, knowing that `p[0]` is the all zeroes block.

Our attack strategy is as follows (also, see [Figure 8](#fig-8)):
1.  Let the original state transition parameters `p[0]`, `p[1]`, `p[2]` be all-zeroes blocks.
2.  Use 1 oracle call to get the key blocks `k[1]`, `k[2]`, `k[3]`, `k[4]` corresponding to those parameters (in fact, `k[1]` only depends on the initial key-IV pair, but we still need it for step 6).
3.  Inject two different differences to all bytes of `p[0]`, observe changes in `k[2]` and deduce `s[1][0]` and `s[1][4]` (using the technique described in [Section 3.2](#3.2-Cryptanalysis-of-AEGIS-with-key-IV-pair-reuse)). This costs 2 oracle calls.
4.  Similarly, use the remaining 4 oracle calls to deduce `s[2][0]`, `s[2][4]` (by injecting differences to `p[1]`) and `s[3][0]`, `s[3][4]` (by injecting differences to `p[2]`).
5.  Based on the relations between the state words as defined in the state transition function, deduce `s[2][3]` and `s[2][7]`, then `s[1][3]` and `s[1][7]`, then `s[1][2]` and `s[1][6]`.
6.  Based on the relations between the state words and the words in the output key block as defined in the key block generation function, deduce `s[1][1]` and `s[1][5]`. The internal state `s[1]` is now fully recovered.

<a name="fig-8"></a>![](images/aegis-128l-attack.png)  
_Figure 8: An attack strategy for AEGIS-128L with key-IV pair reuse._
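
For AEGIS-128L, one modified plaintext block yields two output differences at once, one per half of the key block. The sketch below checks this locally under assumptions: `stand_in_R` is a hash-based placeholder for the AES round, `s0` is a random state rather than a real post-initialization state, and the helper names are hypothetical.

```python
import hashlib
import os


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def and_(a, b):
    return bytes(x & y for x, y in zip(a, b))


def stand_in_R(word):
    # NOT the AES round: the propagation pattern is the same for any R.
    return hashlib.sha256(word).digest()[:16]


def stf128l(s, p):
    new = [xor(stand_in_R(s[j - 1]), s[j]) for j in range(8)]
    new[0] = xor(new[0], p[:16])  # p[i][0] goes into word 0
    new[4] = xor(new[4], p[16:])  # p[i][1] goes into word 4
    return new


def kgf128l(s):
    k0 = xor(xor(s[1], s[6]), and_(s[2], s[3]))
    k1 = xor(xor(s[2], s[5]), and_(s[6], s[7]))
    return k0 + k1


def run(s0, p0):
    """Return s[1], k[1] and k[2] for parameters p[0] = p0 and p[1] = 0."""
    s1 = stf128l(s0, p0)
    s2 = stf128l(s1, bytes(32))
    return s1, kgf128l(s1), kgf128l(s2)


s0 = [os.urandom(16) for _ in range(8)]
s1, k1, k2 = run(s0, bytes(32))
s1_d, k1_d, k2_d = run(s0, b"\x01" * 32)

dk2 = xor(k2, k2_d)
print(k1 == k1_d)                                               # k[1] unchanged
print(dk2[:16] == xor(stand_in_R(s1[0]), stand_in_R(s1_d[0])))  # ∆k[2][0] = ∆R(s[1][0])
print(dk2[16:] == xor(stand_in_R(s1[4]), stand_in_R(s1_d[4])))  # ∆k[2][1] = ∆R(s[1][4])
```

All three lines print `True`. This is why a single oracle call per injected difference is enough to work on `s[1][0]` and `s[1][4]` in parallel.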

It's time to implement the attack!

First of all, let's write down `R`, `invR` and the function that uniquely solves for an input `x` of `R`, given 2 pairs of differences: `diff_in_1`, `diff_out_1` and `diff_in_2`, `diff_out_2`:

In [1]:
import aes  # https://raw.githubusercontent.com/boppreh/aes/master/aes.py


def R(x):
    tmp = aes.bytes2matrix(x)
    aes.sub_bytes(tmp)
    aes.shift_rows(tmp)
    aes.mix_columns(tmp)
    return aes.matrix2bytes(tmp)


def invR(x3):
    tmp = aes.bytes2matrix(x3)
    aes.inv_mix_columns(tmp)
    aes.inv_shift_rows(tmp)
    aes.inv_sub_bytes(tmp)
    return aes.matrix2bytes(tmp)


def solve_x(diff_in_1, diff_out_1, diff_in_2, diff_out_2):
    # precondition for x to be unique
    assert(all(diff_in_1[i] != diff_in_2[i] for i in range(16)))
    
    # aliases
    dx_1, dx_2 = diff_in_1, diff_in_2
    dx3_1, dx3_2 = diff_out_1, diff_out_2
    
    # calculate dx1_1
    tmp1 = aes.bytes2matrix(dx3_1)
    aes.inv_mix_columns(tmp1)
    aes.inv_shift_rows(tmp1)
    dx1_1 = aes.matrix2bytes(tmp1)
    
    # calculate dx1_2
    tmp2 = aes.bytes2matrix(dx3_2)
    aes.inv_mix_columns(tmp2)
    aes.inv_shift_rows(tmp2)
    dx1_2 = aes.matrix2bytes(tmp2)
    
    # brute-force for each component x[i]
    x = bytearray(16)
    for i in range(16):
        xi = set()
        for c in range(256):
            if (
                aes.s_box[c] ^ aes.s_box[c ^ dx_1[i]] == dx1_1[i] and
                aes.s_box[c] ^ aes.s_box[c ^ dx_2[i]] == dx1_2[i]
            ):
                xi.add(c)
                
        # make sure there's a unique solution for each component
        assert(len(xi) == 1)
        x[i] = xi.pop()
        
    return bytes(x)


# testing
import os
from aegis import _xor
x = os.urandom(16)
diff_in_1 = b'\x01' * 16
diff_in_2 = b'\x02' * 16
diff_out_1 = _xor(R(x), R(_xor(x, diff_in_1))) 
diff_out_2 = _xor(R(x), R(_xor(x, diff_in_2)))
print(x == solve_x(diff_in_1, diff_out_1, diff_in_2, diff_out_2))

True


Looks good! Now, let's connect to the server/oracle:

In [2]:
from socket import create_connection
from base64 import b64encode, b64decode

HOST = "localhost"
PORT = 5555
_s = create_connection((HOST, PORT))
_f = _s.makefile()
_f.readline()  # ignore the welcome message
_f.readline()  # ignore the IV too (knowing only the IV is useless)


def oracle1(pt, aad):
    _s.sendall(b64encode(pt) + b'\n')
    _s.sendall(b64encode(aad) + b'\n')
    ct = b64decode(_f.readline())
    tag = b64decode(_f.readline())
    return ct, tag

Perform step 1 & 2:

In [3]:
# the original state transition parameters
p0 = p1 = p2 = b'\x00' * 32

# prepare the plaintext and associated data to be sent to the oracle
pt = p0 + p1 + p2 
aad = b''

# to observe the desired key blocks, we need to make the plaintext two blocks longer
pt += b'\x00' * 32 * 2

# make the oracle call
ct, _ = oracle1(pt, aad)  # the tag can be ignored
k0 = ct[32 * 0: 32 * 1]
k1 = ct[32 * 1: 32 * 2]
k2 = ct[32 * 2: 32 * 3]
k3 = ct[32 * 3: 32 * 4]
k4 = ct[32 * 4: 32 * 5]

The following function solves for `s[i+1][0]` and `s[i+1][4]`, given 2 `∆p[i]`/`∆k[i+2]` pairs:

In [4]:
def aegis_128l_partial_state_recover(pair1, pair2):
    # assuming i = 0
    ds10 = []  # ∆s[1][0]
    ds14 = []  # ∆s[1][4]
    dRs10 = []  # ∆R(s[1][0])
    dRs14 = []  # ∆R(s[1][4])
    for pair in (pair1, pair2):
        dp0, dk2 = pair
        dp00, dp01 = dp0[:16], dp0[16:]
        dk20, dk21 = dk2[:16], dk2[16:]
        ds10.append(dp00)  # ∆s[1][0] == ∆p[0][0] 
        ds14.append(dp01)  # ∆s[1][4] == ∆p[0][1] 
        dRs10.append(dk20)  # ∆R(s[1][0]) == ∆k[2][0] 
        dRs14.append(dk21)  # ∆R(s[1][4]) == ∆k[2][1] 
        
    s10 = solve_x(ds10[0], dRs10[0], ds10[1], dRs10[1])
    s14 = solve_x(ds14[0], dRs14[0], ds14[1], dRs14[1])
    return s10, s14

Perform step 3:

In [5]:
pairs = []
for dp0 in (b'\x01' * 32, b'\x02' * 32):
    m = _xor(p0, dp0) + b'\x00' * 32 * 2
    c, _ = oracle1(m, b'')
    dk2 = _xor(k2, c[32 * 2: 32 * 3])
    pairs.append((dp0, dk2))
s10, s14 = aegis_128l_partial_state_recover(*pairs)    

Perform step 4:


In [6]:
pairs = []
for dp1 in (b'\x01' * 32, b'\x02' * 32):
    m = p0 + _xor(p1, dp1) + b'\x00' * 32 * 2
    c, _ = oracle1(m, b'')
    dk3 = _xor(k3, c[32 * 3: 32 * 4])
    pairs.append((dp1, dk3))
s20, s24 = aegis_128l_partial_state_recover(*pairs) 

pairs = []
for dp2 in (b'\x01' * 32, b'\x02' * 32):
    m = p0 + p1 + _xor(p2, dp2) + b'\x00' * 32 * 2
    c, _ = oracle1(m, b'')
    dk4 = _xor(k4, c[32 * 4: 32 * 5])
    pairs.append((dp2, dk4))
s30, s34 = aegis_128l_partial_state_recover(*pairs) 

Perform step 5:

In [7]:
s23 = invR(_xor(s34, s24))
s27 = invR(_xor(s30, s20))

s13 = invR(_xor(s24, s14))
s17 = invR(_xor(s20, s10))

s12 = invR(_xor(s23, s13))
s16 = invR(_xor(s27, s17))

Perform step 6:

In [8]:
from aegis import _and
k10, k11 = k1[:16], k1[16:]
s11 = _xor(_xor(k10, s16), _and(s12, s13))
s15 = _xor(_xor(k11, s12), _and(s16, s17))
s1 = s10 + s11 + s12 + s13 + s14 + s15 + s16 + s17

Let's send the already recovered state to the server.

In [9]:
_s.sendall(b64encode(s1) + b'\n')
print(_f.readline().strip())

OK


Now, the first sub-challenge has been solved. In the next one, we need to perform a chosen-ciphertext attack on AEGIS-128 (again, see [Section 2.2](#2.2-The-state-transition-function) and [Section 2.3](#2.3-The-key-block-generation-function) if you want to look back at its state transition and key block generation functions) with key-IV pair reuse.

### 4.2 Sub-challenge 2: Chosen ciphertext attack on AEGIS-128 with key-IV pair reuse

In the second sub-challenge, another random key-IV pair is generated and used to initialize an AEGIS-128 cipher. This time, the associated data is fixed and its value is given to us. Besides that, the ciphertext corresponding to a random ASCII string of 96 bytes is also given.

In [10]:
_f.readline()  # ignore the starting message
_f.readline()  # ignore the iv
aad = b64decode(_f.readline())  # the associated data
ct = b64decode(_f.readline())  # the ciphertext
_f.readline();  # ignore the tag

For AEGIS-128, each state transition parameter consists of only 1 word, so we will stick with the notation `p[i]` to refer to that word instead of `p[i][0]`. Similarly, we will use `m[i]`, `k[i]`, `c[i]` instead of `m[i][0]`, `k[i][0]`, `c[i][0]`.

In [11]:
c0 = ct[16 * 0: 16 * 1]
c1 = ct[16 * 1: 16 * 2]
c2 = ct[16 * 2: 16 * 3]
c3 = ct[16 * 3: 16 * 4]
c4 = ct[16 * 4: 16 * 5]
c5 = ct[16 * 5: 16 * 6]

For each oracle call, we can only specify a ciphertext (since the associated data has been fixed), and the oracle only returns the position and the exact value of the first non-ASCII byte in the corresponding plaintext. We can have up to 231 oracle calls.

In [12]:
import re

calls_made = 0

# the 2nd oracle
def oracle2(ct):
    global calls_made
    calls_made += 1
    assert calls_made <= 231
    
    _s.sendall(b64encode(ct) + b'\n')
    # raw string, so that \d is not treated as an (invalid) string escape
    tmp = re.findall(r"codec can't decode byte (.+) in position (\d+)", _f.readline())
    if tmp:
        value = int(tmp[0][0], 16)
        position = int(tmp[0][1])
        return position, value
    else:  
        # no non-ASCII byte found in the corresponding plaintext
        return None, None
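
The pattern above matches the text of Python's own `UnicodeDecodeError`. The sketch below reproduces such a message locally and parses it; it assumes the server simply relays that error text, and `first_non_ascii_reply`, `parse` and the `"OK"` reply are hypothetical.

```python
import re

PATTERN = re.compile(r"codec can't decode byte (0x[0-9a-f]+) in position (\d+)")


def first_non_ascii_reply(plaintext):
    """Mimic what the oracle is assumed to report for a decrypted plaintext."""
    try:
        plaintext.decode("ascii")
    except UnicodeDecodeError as err:
        return str(err)
    return "OK"  # hypothetical reply; the real server's message may differ


def parse(reply):
    match = PATTERN.search(reply)
    if match is None:
        return None, None
    return int(match.group(2)), int(match.group(1), 16)


reply = first_non_ascii_reply(b"hello\xe4world")
print(reply)         # 'ascii' codec can't decode byte 0xe4 in position 5: ordinal not in range(128)
print(parse(reply))  # (5, 228)
print(parse(first_non_ascii_reply(b"plain ascii")))  # (None, None)
```

Only the first offending byte is reported, which is why recovering a whole plaintext takes roughly one oracle call per byte.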

Our goal is, again, to recover the internal state of the cipher, right after the first parameter under our control gets processed (`s[u+1]` as in [Figure 1](#fig-1)).

For now, let's define a proper `decrypt` function on top of the weird oracle call:

In [13]:
def decrypt(ct, offset=0):
    """
    Decrypt input ciphertext starting at `offset` using calls to oracle2.
    
    If `offset` is specified, this function will assume that the first `offset` bytes
    in the corresponding plaintext (`pt[:offset]`) are ASCII bytes.
    
    When a non-ASCII byte `c` is encountered, this function will only attempt to decrypt
    at most 15 bytes more. The reason is, we need to flip the most significant bit in `c`
    to avoid its value being printed out repeatedly. This introduces a change in the
    internal state of the AEGIS cipher and may cause the decryption of the subsequent
    bytes to be incorrect. However, the upcoming 15 bytes (maybe more) after `c` are 
    guaranteed to be unaffected.
    
    To decrypt n bytes, this function makes n+1 or n calls to `oracle2`, depending
    on whether the last byte of the corresponding plaintext is an ASCII byte or
    not.
    """
    if offset == len(ct):
        return bytearray()
    
    # find the first non-ASCII byte in the corresponding plaintext
    position, value = oracle2(ct)
    
    if position is None: # all bytes are ASCII ones
        position = len(ct)
    else:
        # make sure there's no non-ASCII byte before `offset`
        assert position >= offset  
        
        # make sure we won't go too far away from the first encountered non-ASCII byte
        assert position + 16 >= len(ct)  
                 
    # fetch all bytes in the range [offset, position)
    pt = bytearray(position - offset)
    for i in range(offset, position):
        ct_copy = bytearray(ct)
        ct_copy[i] ^= 0x80  # flip the MSB of ct[i]
        p, v = oracle2(ct_copy)
        assert p == i
        pt[i - offset] = v ^ 0x80

    if value is None: 
        # there's nothing left to do
        return pt
    else:
        # make a recursive call to fetch the remaining bytes
        next_ct = bytearray(ct)
        next_ct[position] ^= 0x80
        return pt + bytearray([value]) + decrypt(next_ct, position + 1)
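
The core trick of `decrypt` (flip the MSB of one ciphertext byte and read the reported value) can be tried against a local mock. This is a deliberately simplified sketch: `mock_oracle` uses a fixed keystream as a stand-in for the AEGIS key blocks, so it ignores the state feedback that the real function has to work around, and all names are hypothetical.

```python
import os

KEYSTREAM = os.urandom(8)  # known only to the mock "server"


def mock_oracle(ct):
    """Return (position, value) of the first non-ASCII plaintext byte, if any."""
    pt = bytes(c ^ k for c, k in zip(ct, KEYSTREAM))
    for position, value in enumerate(pt):
        if value >= 0x80:
            return position, value
    return None, None


def recover_ascii_plaintext(ct):
    # Precondition for this simplified version: the plaintext is all ASCII.
    assert mock_oracle(ct) == (None, None)
    recovered = bytearray()
    for i in range(len(ct)):
        probe = bytearray(ct)
        probe[i] ^= 0x80  # makes plaintext byte i non-ASCII
        position, value = mock_oracle(bytes(probe))
        assert position == i
        recovered.append(value ^ 0x80)  # undo the flip
    return bytes(recovered)


secret = b"ASCII ok"
ct = bytes(m ^ k for m, k in zip(secret, KEYSTREAM))
print(recover_ascii_plaintext(ct) == secret)
```

The check prints `True`. With the real cipher, the flipped byte also changes the internal state, which is the reason for the 15-byte limit discussed in the docstring above.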

Let's try to decrypt the ciphertext given to us:

In [14]:
pt = decrypt(ct)
m0 = pt[16 * 0: 16 * 1]
m1 = pt[16 * 1: 16 * 2]
m2 = pt[16 * 2: 16 * 3]
m3 = pt[16 * 3: 16 * 4]
m4 = pt[16 * 4: 16 * 5]
m5 = pt[16 * 5: 16 * 6]

Then, deduce the key blocks:

In [15]:
k = _xor(pt, ct)
k0 = k[16 * 0: 16 * 1]
k1 = k[16 * 1: 16 * 2]
k2 = k[16 * 2: 16 * 3]
k3 = k[16 * 3: 16 * 4]
k4 = k[16 * 4: 16 * 5]
k5 = k[16 * 5: 16 * 6]

Similar to what we have done with AEGIS-128L, our attack strategy for AEGIS-128 is as follows (also, see [Figure 9](#fig-9)):
1.  Let the original state transition parameters `p[u]`, `p[u+1]`, `p[u+2]`, `p[u+3]` be our already decrypted plaintext above. So we can inject differences to `p[i]` through the given ciphertext.
2.  The original key blocks have already been computed: `k[1]`, `k[2]`, `k[3]`, `k[4]`, `k[5]`, which already cost us 97 oracle calls (can be reduced to 96 if the MSB of the last byte of the ciphertext is flipped before the ciphertext is passed to the `decrypt` function).
3.  Inject two different differences to all bytes of `p[u]` (by injecting the differences to `c[0]`), observe changes in `k[2]` and deduce `s[u+1][0]`. This costs 2 * (16~17) oracle calls.
4.  Similarly, deduce `s[u+2][0]`, `s[u+3][0]`, `s[u+4][0]`. This costs 3 * 2 * (16~17) oracle calls.
5.  Based on the relations between the state words as defined in the state transition function, deduce `s[u+3][4]`, `s[u+2][4]`, `s[u+1][4]`, then `s[u+2][3]`, `s[u+1][3]`, and then `s[u+1][2]`.
6.  Based on the relation between the state words and the word in the output key block as defined in the key block generation function, deduce `s[u+1][1]`. The internal state `s[u+1]` is now fully recovered.

So, in total, the attack needs between 96 + 8 * 16 = 224 and 96 + 8 * 17 = 232 oracle calls (one more if the initial decryption costs 97 calls). With 231 oracle calls at hand, our attack has a high probability of succeeding.
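
To put a rough number on "high probability", here is a back-of-the-envelope sketch. It rests on an assumption that is not verified in this write-up: each of the 8 partial decryptions costs 16 calls plus one extra call with probability 1/2, independently (the extra call is needed when the last decrypted byte happens to be ASCII). `success_probability` is a hypothetical helper.

```python
from math import comb

BUDGET = 231  # oracle calls allowed in the second sub-challenge


def success_probability(base_calls, rounds=8, min_cost=16):
    """Chance that the total number of calls stays within BUDGET.

    Assumption: every round costs min_cost calls, plus one extra call
    with probability 1/2, independently of the other rounds.
    """
    spare = BUDGET - base_calls - rounds * min_cost
    if spare < 0:
        return 0.0
    favourable = sum(comb(rounds, extra) for extra in range(min(spare, rounds) + 1))
    return favourable / 2 ** rounds


for base in (96, 97):
    print(base, success_probability(base))
```

Under that assumption, the attack stays within budget with probability 255/256 (about 99.6%) when the initial decryption costs 96 calls, and 247/256 (about 96.5%) when it costs 97.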

<a name="fig-9"></a>![](images/aegis-128-attack.png)  
_Figure 9: An attack strategy for AEGIS-128 with key-IV pair reuse._

The following function solves for `s[u+i+1][0]`, given 2 `∆c[i]`/`∆k[i+2]` pairs:

In [16]:
def aegis_128_partial_state_recover(pair1, pair2):
    # assuming i = 0
    ds10 = []  # ∆s[u+1][0]
    dRs10 = []  # ∆R(s[u+1][0])
    for pair in (pair1, pair2):  # use the arguments, not the global `pairs`
        dc0, dk2 = pair
        ds10.append(dc0)  # ∆s[u+1][0] = ∆p[u] = ∆m[0] = ∆c[0]
        dRs10.append(dk2)  # ∆R(s[u+1][0]) = ∆k[2]
    return solve_x(ds10[0], dRs10[0], ds10[1], dRs10[1])

Perform step 3:

In [17]:
pairs = []
for dc0 in (b'\x01' * 16, b'\x02' * 16):
    m2_ = decrypt(_xor(c0, dc0) + c1 + c2, offset=16*2)
    dk2 = _xor(m2, m2_)
    pairs.append((dc0, dk2))
s10 = aegis_128_partial_state_recover(*pairs)  # s[u+1][0]

Perform step 4:

In [18]:
pairs = []
for dc1 in (b'\x01' * 16, b'\x02' * 16):
    m3_ = decrypt(c0 + _xor(c1, dc1) + c2 + c3, offset=16*3)
    dk3 = _xor(m3, m3_)
    pairs.append((dc1, dk3))
s20 = aegis_128_partial_state_recover(*pairs)  # s[u+2][0]

pairs = []
for dc2 in (b'\x01' * 16, b'\x02' * 16):
    m4_ = decrypt(c0 + c1 + _xor(c2, dc2) + c3 + c4, offset=16*4)
    dk4 = _xor(m4, m4_)
    pairs.append((dc2, dk4))
s30 = aegis_128_partial_state_recover(*pairs)  # s[u+3][0]

pairs = []
for dc3 in (b'\x01' * 16, b'\x02' * 16):
    m5_ = decrypt(c0 + c1 + c2 + _xor(c3, dc3) + c4 + c5, offset=16*5)
    dk5 = _xor(m5, m5_)
    pairs.append((dc3, dk5))
s40 = aegis_128_partial_state_recover(*pairs)  # s[u+4][0]

Perform step 5:

In [19]:
s34 = invR(_xor(_xor(s30, s40), m3))  # s[u+3][4]
s24 = invR(_xor(_xor(s20, s30), m2))  # s[u+2][4]
s14 = invR(_xor(_xor(s10, s20), m1))  # s[u+1][4]

s23 = invR(_xor(s24, s34))  # s[u+2][3]
s13 = invR(_xor(s14, s24))  # s[u+1][3]

s12 = invR(_xor(s13, s23))  # s[u+1][2]

Perform step 6:

In [20]:
s11 = _xor(_xor(k1, s14), _and(s12, s13))  # s[u+1][1]
s1 = (s10, s11, s12, s13, s14)  # s[u+1]

Now, use the recovered state to encrypt `DATA`, then send the result to the server to get the flag.

In [21]:
from aegis import Aegis128
DATA = b"""
Hear your fate, O dwellers in Sparta of the wide spaces;

Either your famed, great town must be sacked by Perseus' sons,
Or, if that be not, the whole land of Lacedaemon
Shall mourn the death of a king of the house of Heracles,
For not the strength of lions or of bulls shall hold him,
Strength against strength; for he has the power of Zeus,
And will not be checked until one of these two he has consumed
"""
S, ct = Aegis128.raw_encrypt(s1, DATA)
aad += m0
tag = Aegis128.finalize(S, len(aad) * 8, len(DATA) * 8)
if calls_made < 231:
    _s.sendall(b"challenge\n")
_s.sendall(b64encode(ct) + b'\n')
_s.sendall(b64encode(aad) + b'\n')
_s.sendall(b64encode(tag) + b'\n')
print(_f.readline().strip())

Congrats! Flag: flag{test}


There was an interesting story during the competition that explains why there were 2 different oracles: "oracle2.2020.ctfcompetition.com" and "oracle.2020.ctfcompetition.com". The latter was published first, but it was somehow misconfigured: in the second sub-challenge, each oracle call returned not just 1 byte of the plaintext but the whole plaintext, by sending back the `UnicodeDecodeError` object in pickle serialization format (the plaintext is one of its attributes). This made the challenge easier to solve (at least, we didn't need to bother with the limit on the number of oracle calls). However, we still needed to understand the attack in order to complete the challenge.

Since some teams had already solved the challenge with the weaker oracle, the organizers decided to keep it, together with the newly deployed one.

## 5. References
-   Wu H., Preneel B. (2014) AEGIS: A Fast Authenticated Encryption Algorithm.