# Differential Fault Analysis on the hardware AES co-processor

In this notebook we target the hardware (HW) AES co-processor using Differential Fault Analysis (DFA). As in the side-channel analysis notebook, our goal is to extract the cryptographic key from the target device.
The target implementation is very similar to the one used in the side-channel analysis notebook, but in this case the target uses a hardcoded AES key. When we send the device a 16-byte plaintext, it responds with a 16-byte ciphertext. Our goal is to inject single-byte faults before the last MixColumns so that exactly 4 ciphertext bytes are corrupted.

To perform the actual analysis we use [phoenixAES](https://github.com/SideChannelMarvels/JeanGrey/tree/master/phoenixAES) from the [Side-Channel Marvels](https://github.com/SideChannelMarvels) project. If you would like to learn more about how this attack works under the hood you can start by looking at the following resources:

* [Differential Fault Analysis on White-box AES Implementations](https://blog.quarkslab.com/differential-fault-analysis-on-white-box-aes-implementations.html) - Quarkslab - Philippe Teuwen and Charles Hubain
* [FAULT201 - Lab 1-3A - DFA Attack Against Last MixColumns](https://github.com/newaetech/chipwhisperer-jupyter/blob/master/courses/fault201/Lab%201_3A%20-%20DFA%20Attack%20Against%20Final%20MixColumns.ipynb) - NewAE ChipWhisperer tutorial
* [Differential Fault Analysis on A.E.S.](https://eprint.iacr.org/2003/010.pdf) - Dusart et al.



## Hardware Setup
To run this notebook you will need the hardware outlined in the main README of the repository. You will also need to modify your target for voltage glitching; at a minimum this requires removing C19 and connecting the ChipWhisperer glitch port to the DCOUPL pin. All instructions for modifying the targets are provided in the main README.

Before running the notebook, connect your ChipWhisperer to the modified LAUNCHXL-CC2640R2 board as follows:
* Connect the target SMA connector to your ChipWhisperer's Glitch/Crowbar port
* Remove the 3V3 jumper and connect the target side pin to the ChipWhisperer's 3V3 output
* Remove the RESET jumper and connect the target side to the ChipWhisperer's NRST output
* Connect the ChipWhisperer's IO4/TRG to the target's DIO6 pin
* Connect the ChipWhisperer's ground to a ground pin on the target board


## Preparation

The following cells load the required libraries and initialise the ChipWhisperer as well as our target.

In [1]:
import sys
import time
import os
import numpy as np
import chipwhisperer as cw
from tqdm.notebook import tqdm
import serial
import matplotlib.pyplot as plt

ser = 0

In [2]:
# Connect to the ChipWhisperer and perform some basic initialization

scope = cw.scope()

scope.clock.clkgen_src = 'system' 
scope.clock.clkgen_freq = 200e6          # Main ChipWhisperer clock
scope.clock.adc_mul = 1
scope.trigger.triggers = 'tio4'          # Trigger on a rising edge of TIO4 (connected to DIO6)
scope.adc.basic_mode = 'rising_edge'
scope.io.target_pwr = True

scope.glitch.enabled = True
scope.glitch.clk_src = 'pll'
#scope.clock.pll.update_fpga_vco(600e6)
scope.glitch.output = 'enable_only'
scope.glitch.trigger_src = 'ext_single'
scope.io.glitch_lp = True                # only using the 'low power' glitch mosfet
scope.io.glitch_hp = False
scope.glitch.ext_offset = 300            # Glitch offset from the external trigger (in cycles of the main CW clock)

In [3]:
# Connect to the LAUNCHXL-CC2640R2 UART
# You may have to change the serial port ('/dev/ttyACM1')

if ser:
    ser.close()

ser = serial.Serial('/dev/ttyACM1', 115200)

In [4]:
# Modify the dslite_path variable to point to your installation of Uniflash
# Running this cell will load the example target firmware
# THIS WILL OVERWRITE THE FIRMWARE ON YOUR LAUNCHXL-CC2640R2

import subprocess
from pathlib import Path

home_dir = str(Path.home()) 
dslite_path = home_dir + '/ti/uniflash_7.0.0/dslite.sh'
erase_cmd = dslite_path + ' --mode cc13xx-cc26xx-mass-erase -d XDS110'
flash_cmd = dslite_path + ' --config ./bin/CC2640R2F.ccxml --flash ./bin/VFI_SCA_CC2640R2.out' 

process = subprocess.Popen(erase_cmd.split(' '), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output = process.communicate()

if b'Device Unlocked' not in output[0]:
    print('There was an error while trying to erase the microcontroller')
    print(output)
else:
    scope.io.nrst = 'low'
    scope.io.target_pwr = False
    time.sleep(0.1)
    scope.io.target_pwr = True
    scope.io.nrst = 'high'
    
    process = subprocess.Popen(flash_cmd.split(' '), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output = process.communicate()
    if b'Board Reset Complete' in output[0]:
        print('Target has been flashed!')
    else:
        print('Error flashing target. Check your connections and try again.')

Target has been flashed!


In [5]:
# Simple function to reset the target microcontroller
def reset_dut(delay=0.1):
    scope.io.nrst = 'low'
    scope.io.target_pwr = False
    time.sleep(delay)
    scope.io.target_pwr = True
    scope.io.nrst = 'high'
    time.sleep(0.05)
    ser.flushInput()
    ser.write(b'h') # To select the hardware aes function of the firmware
    

# A more thorough reset function that verifies that the target is alive again
def thorough_reset_dut(delay=0.05): 
    reset_dut(delay)
    
    ser.flushInput()
    ser.write(bytes([0xAA]*16))
    time.sleep(0.05)
    ret = ser.read(ser.in_waiting)
        
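    # Keep power-cycling with a longer delay until the target answers with a full 16-byte block.
    # The response must be re-read inside the loop, otherwise the loop never terminates.
    while len(ret) != 16:
        delay += 0.5
        reset_dut(delay)
        ser.write(bytes([0xAA]*16))
        time.sleep(0.05)
        ret = ser.read(ser.in_waiting)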

In [6]:
# Recall from the introduction that we send a 16-byte plaintext to the target.
# The target responds with a 16-byte ciphertext; we do not know the key used by our target.

reset_dut()
pt = bytes([0xAA]*16)

ser.flushInput()
ser.write(pt)
time.sleep(0.05)
ct = ser.read(ser.in_waiting)

print('Plaintext: ', pt.hex())
print('Ciphertext:', ct.hex())

Plaintext:  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Ciphertext: 7117d95886a44a8bcf16570335cad7d6


## Can we fault the HW AES?

Before we try to extract the secret key, we first determine whether voltage fault injection can produce faulty outputs at all.

Note that in the target firmware the trigger is placed as close to the actual AES operation as possible, so there will probably not be any faults that cause the key to leak (see section 5.2 of the paper).

Within a few minutes you should start seeing faulty ciphertext outputs. From an offset of roughly 600 onwards, the faulted ciphertexts should have quite a few bytes in common with the reference ciphertext.

In [7]:
scope.glitch.repeat = 18
offsets = np.arange(0, 1000, 5)
repeats = 5
resets = 0
faults = 0

pt = [0xAA]*16
reset_dut(delay=1)
time.sleep(0.05)
ser.write(bytes(pt))
time.sleep(0.05)
baseline = ser.read(ser.in_waiting)
print('Normal output:', baseline.hex())

for offset in tqdm(range(len(offsets))):
    scope.glitch.ext_offset = offsets[offset]
        
    for i in range(repeats):
        scope.arm()

        ser.write(bytes(pt))
        time.sleep(0.05)
        ret = ser.read(ser.in_waiting)

        if ret != baseline:
            if ret == b'':
                resets += 1
            else:
                faults += 1
                if len(ret) == 16:
                    print('Fault?!', offsets[offset], ret.hex())
                else:
                    print('Weird?!', offsets[offset], ret.hex(), len(ret))

            thorough_reset_dut()
                
    if offsets[offset] % 100 == 0:
        print(offsets[offset], resets, faults)
                
total = len(offsets)*repeats
print("Total # attempts:", total) 
print("Total # faults: %d (%f%%)" % (faults, (faults/total)*100))
print("Total # resets: %d (%f%%)" % (resets, (resets/total)*100))

Normal output: 7117d95886a44a8bcf16570335cad7d6



0 3 0
Fault?! 75 de045428c92142ee8cc109023d3b133e
100 79 1
Fault?! 120 6eac6a770fe1bdf68b9c00ebbf6c1dcf
Fault?! 120 6eac6a770fe1bdf68b9c00ebbf6c1dcf
Fault?! 120 6eac6a770fe1bdf68b9c00ebbf6c1dcf
200 124 4
Fault?! 270 106c54ed0a4028308faf8ff4322d2b9c
Fault?! 295 925194bf23024faef8ed4066abe67f75
Fault?! 300 66cf5f009cb8e2942d6287152ebd5e2e
300 168 7
Fault?! 370 df46394c4b88c26a2d8bec84cc23fa4b
Fault?! 385 eceb393632a45ab02183cefc9340ec9a
Fault?! 395 6ab4560b3c4ed0ef789da583ef56a9ff
400 225 10
Fault?! 440 7790900d18fe4179cd2359fccc782f08
Fault?! 440 7a6d88bbdd2d45926641d2c775a5bf57
Fault?! 495 9fd3626effac3bed9940685233ea1203
Fault?! 500 0e4d198998f26d8ff20401fd2027680a
500 293 14
Fault?! 510 bd7041bb527303b917005ea65fe37f0c
Fault?! 515 f8ff17aad14bdc82ece53a999e6d85db
Fault?! 515 52274f7bd3b754375d8eb3cc695fe475
Fault?! 520 2b6516e099623fc9392c4deba99c0905
Fault?! 530 998488035d7ea4da6a81c25b48eb9dfb
Fault?! 535 cf2987f7a9563235a28ef9010f86d479
Fault?! 535 3ff3331e4ed6d6ea17646609b00bbfc7
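
The sweep above prints each fault as it occurs. To judge at which offsets the faulted ciphertexts start to share bytes with the reference, it can help to tally the smallest number of differing bytes per offset. The following is a minimal sketch that assumes you collect `(offset, response)` pairs in a list yourself (the cell above does not do this); the function name `min_diff_per_offset` is illustrative.

```python
def min_diff_per_offset(results, baseline):
    """Map each offset to the smallest number of differing bytes seen there.

    `results` is an iterable of (offset, response_bytes) pairs. Responses that
    are empty, have the wrong length, or equal the baseline are ignored.
    """
    best = {}
    for offset, response in results:
        if len(response) != len(baseline) or response == baseline:
            continue
        differing = sum(a != b for a, b in zip(response, baseline))
        best[offset] = min(differing, best.get(offset, len(baseline)))
    return best


baseline = bytes.fromhex('7117d95886a44a8bcf16570335cad7d6')
results = [
    (75, bytes.fromhex('de045428c92142ee8cc109023d3b133e')),   # fault from the log above
    (120, bytes.fromhex('6eac6a770fe1bdf68b9c00ebbf6c1dcf')),  # fault from the log above
    (120, b''),                                                # a reset: no response
]
print(min_diff_per_offset(results, baseline))
```

Offsets whose minimum drops well below 16 are the ones worth scanning with a finer step size.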

## Are there any faulty ciphertexts that can be easily exploited?

As explained earlier, one of the easiest faults to exploit is one that introduces an error in a single state byte before the last MixColumns. Such a fault results in exactly 4 ciphertext bytes differing from the reference output and can thus be detected easily in this demo setting. Note that we are not just looking for 4 faulted bytes; they must also be in the correct locations.

This fault injection campaign should produce a few outputs in which 4 bytes are different. As an example, the following output shows an exploitable fault:
```
Fault at offset: 593 , # bytes: 4
7817d95886a44a2fcf1606033504d7d6 <-- Faulted output
09000000000000a40000510000ce0000 <-- XOR between reference and faulted output
```
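
The "correct location" requirement follows from the AES round structure: a fault confined to one column of the state after the last MixColumns is spread over four different columns by the final ShiftRows. The sketch below computes the four ciphertext positions for each column. It assumes the usual column-major AES byte order (byte index = row + 4 * column); the function name is ours.

```python
def affected_positions(column):
    """Ciphertext byte indices touched by a fault in one MixColumns column.

    Assumes the standard AES state layout where byte i sits at
    row i % 4, column i // 4, and ShiftRows rotates row r left by r.
    """
    positions = []
    for row in range(4):
        new_column = (column - row) % 4   # ShiftRows moves the byte left by `row`
        positions.append(row + 4 * new_column)
    return sorted(positions)


for column in range(4):
    print(column, affected_positions(column))
```

Column 0 yields positions 0, 7, 10 and 13, which is exactly where the XOR in the example above is non-zero.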

In [8]:
# Simple function to count number of faulted bytes
def get_diff(base, fault):
    diff = bytearray([0]*16)
    cnt = 0
    for i in range(16):
        b = fault[i] ^ base[i]
        diff[i] = b
        if b != 0:
            cnt += 1
            
    return diff, cnt

In [9]:
scope.glitch.repeat = 18
offsets = np.arange(580, 730, 1)
repeats = 50
resets = 0
faults = 0

pt = [0xAA]*16
reset_dut(delay=1)
time.sleep(0.05)
ser.write(bytes(pt))
time.sleep(0.05)
baseline = ser.read(ser.in_waiting)
print('Normal output:', baseline.hex())

dfa_faults = [] # Used to store all faulty ciphertexts

for offset in tqdm(range(len(offsets))):
    scope.glitch.ext_offset = offsets[offset]
        
    for i in range(repeats):
        scope.arm()

        ser.write(bytes(pt))
        time.sleep(0.05)
        ret = ser.read(ser.in_waiting)

        if ret != baseline:
            if ret == b'':
                resets += 1
            else:
                faults += 1
                if len(ret) == 16:
                    dfa_faults.append(ret.hex())
                    diff, cnt = get_diff(baseline, ret)
                    print('Fault at offset:', offsets[offset], '- # bytes:', cnt)
                    print(ret.hex())
                    print(diff.hex(), '\n')

            thorough_reset_dut()
                
    if offsets[offset] % 100 == 0:
        print(offsets[offset], resets, faults)
                
total = len(offsets)*repeats
print("Total # attempts:", total) 
print("Total # faults: %d (%f%%)" % (faults, (faults/total)*100))
print("Total # resets: %d (%f%%)" % (resets, (resets/total)*100))

Normal output: 7117d95886a44a8bcf16570335cad7d6



Fault at offset: 580 - # bytes: 16
b8fadcb0848ce27e9396fc1178fe0749
c9ed05e80228a8f55c80ab124d34d09f 

Fault at offset: 581 - # bytes: 16
1da2edd0140829fc6be0411c9d7285d0
6cb5348892ac6377a4f6161fa8b85206 

Fault at offset: 581 - # bytes: 16
e683b640159c3959db2ac3093cc84297
97946f18933873d2143c940a09029541 

Fault at offset: 581 - # bytes: 16
15b3aa006e2e09e665f5cee83a18d29d
64a47358e88a436daae399eb0fd2054b 

Fault at offset: 581 - # bytes: 16
5c9af46ba1f92016d438c8c6be21055e
2d8d2d33275d6a9d1b2e9fc58bebd288 

Fault at offset: 582 - # bytes: 16
e683b640159c3959db2ac3093cc84297
97946f18933873d2143c940a09029541 

Fault at offset: 582 - # bytes: 16
1da2edd0140829fc6be0411c9d7285d0
6cb5348892ac6377a4f6161fa8b85206 

Fault at offset: 586 - # bytes: 16
24449c1952e1642cca2fad003c3458aa
55534541d4452ea70539fa0309fe8f7c 

Fault at offset: 587 - # bytes: 16
432a50fa447ceec8bf77b2f24aa5490e
323d89a2c2d8a4437061e5f17f6f9ed8 

Fault at offset: 588 - # bytes: 16
c5423b2c45f1b6ffb0438877339f72a2
b455e

Fault at offset: 600 - # bytes: 8
7199c658649a4a8b301657b235ca3dea
008e1f00e23e0000ff0000b10000ea3c 

Fault at offset: 600 - # bytes: 4
7117925886cb4a8b9a16570335cad72f
00004b00006f000055000000000000f9 

Fault at offset: 600 - # bytes: 8
71f5b658809c4a8bdb16570e35ca0d97
00e26f00063800001400000d0000da41 

Fault at offset: 600 - # bytes: 8
7105ec582a4e4a8b0616573335ca6d69
00123500acea0000c90000300000babf 

Fault at offset: 600 - # bytes: 8
71dfb658ca9c4a8bdb16572e35ca6b97
00c86f004c3800001400002d0000bc41 

Fault at offset: 600 - # bytes: 4
7156d958d4a44a8bcf16574235ca7dd6
0041000052000000000000410000aa00 

Fault at offset: 600 - # bytes: 8
718f875895194a8b1616575b35caa2ed
00985e0013bd0000d90000580000753b 

Fault at offset: 600 - # bytes: 8
7145b658989c4a8bdb16572d35ca9b97
00526f001e3800001400002e00004c41 

Fault at offset: 600 - # bytes: 8
718c3758fd954a8b3d16577835ca7add
009bee007b310000f200007b0000ad0b 

Fault at offset: 600 - # bytes: 4
71174d5886274a8be916570335cad78f
000094000083000

Fault at offset: 621 - # bytes: 1
7117d95886a4c88bcf16570335cad7d6
00000000000082000000000000000000 

Fault at offset: 623 - # bytes: 1
7117d95886a44a8bef16570335cad7d6
00000000000000002000000000000000 

Fault at offset: 624 - # bytes: 2
7117d95886a44a8b9216f00335cad7d6
00000000000000005d00a70000000000 

Fault at offset: 624 - # bytes: 2
7117d95886a44a8b8216530335cad7d6
00000000000000004d00040000000000 

Fault at offset: 624 - # bytes: 2
7117d95886a44a8b9016b20335cad7d6
00000000000000005f00e50000000000 

Fault at offset: 624 - # bytes: 3
7117d95886a44a8bd415f60335cad7d6
00000000000000001b03a10000000000 

Fault at offset: 624 - # bytes: 1
7117d95886a44a8bef16570335cad7d6
00000000000000002000000000000000 

Fault at offset: 624 - # bytes: 3
7117d95886a44a8bd015f60335cad7d6
00000000000000001f03a10000000000 

Fault at offset: 624 - # bytes: 2
7117d95886a44a8b8216530335cad7d6
00000000000000004d00040000000000 

Fault at offset: 624 - # bytes: 3
7117d95886a44a8bd415f60335cad7d6
000000000000000

We now have a list of faulty ciphertexts. The following cell selects the ciphertexts whose differences are confined to one of the four byte patterns we are looking for.

In [10]:
dfa_faults_r9 = []

# Each mask is 0x00 at the four ciphertext positions that one faulted MixColumns
# column ends up in after the final ShiftRows, and 0xFF everywhere else.
mask1 = [0,0xFF,0xFF,0xFF,  0xFF,0xFF,0xFF,0,  0xFF,0xFF,0,0xFF,  0xFF,0,0xFF,0xFF]
mask2 = [0xFF,0,0xFF,0xFF,  0,0xFF,0xFF,0xFF,  0xFF,0xFF,0xFF,0,  0xFF,0xFF,0,0xFF]
mask3 = [0xFF,0xFF,0,0xFF,  0xFF,0,0xFF,0xFF,  0,0xFF,0xFF,0xFF,  0xFF,0xFF,0xFF,0]
mask4 = [0xFF,0xFF,0xFF,0,  0xFF,0xFF,0,0xFF,  0xFF,0,0xFF,0xFF,  0,0xFF,0xFF,0xFF]

# Iterate over the distinct faulty ciphertexts only
for d in set(dfa_faults):
    b = bytearray.fromhex(d)
    diff, cnt = get_diff(baseline, b)

    # A sum of zero means that no byte outside the pattern differs from the reference
    r1 = sum([mask1[i] & diff[i] for i in range(16)])
    r2 = sum([mask2[i] & diff[i] for i in range(16)])
    r3 = sum([mask3[i] & diff[i] for i in range(16)])
    r4 = sum([mask4[i] & diff[i] for i in range(16)])
    
    if r1 == 0 or r2 == 0 or r3 == 0 or r4 == 0:
        dfa_faults_r9.append(d)
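
The mask test in the cell above accepts any difference that stays inside one pattern, including the 1, 2 and 3 byte faults seen around offsets 621 to 624. MixColumns spreads a single-byte difference over all four bytes of its column and SubBytes is a bijection, so a genuine single-byte fault before the last MixColumns always changes all four bytes of a pattern. The following sketch is a stricter pre-filter under that assumption. The names `PATTERNS` and `faulted_column` are illustrative, and phoenixAES may well discard unsuitable candidates on its own.

```python
# Ciphertext positions hit by a fault in MixColumns column 0..3
# (these are the zero entries of mask1..mask4).
PATTERNS = [
    (0, 7, 10, 13),
    (1, 4, 11, 14),
    (2, 5, 8, 15),
    (3, 6, 9, 12),
]


def faulted_column(baseline, faulty):
    """Return the column index if exactly one full pattern differs, else None."""
    differing = tuple(i for i in range(16) if baseline[i] != faulty[i])
    return PATTERNS.index(differing) if differing in PATTERNS else None


reference = bytes.fromhex('7117d95886a44a8bcf16570335cad7d6')
good = bytes.fromhex('7817d95886a44a2fcf1606033504d7d6')   # 4-byte example fault shown earlier
late = bytes.fromhex('7117d95886a4c88bcf16570335cad7d6')   # 1-byte fault seen at offset 621

print(faulted_column(reference, good))
print(faulted_column(reference, late))
```

Knowing the column index also tells you which four key bytes a given fault can help recover, which is useful when deciding where more faults are needed.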

The next cell writes the reference ciphertext and the faulty ciphertexts to a file in the format expected by [phoenixAES](https://github.com/SideChannelMarvels/JeanGrey/tree/master/phoenixAES). Afterwards we use phoenixAES to try and recover the last round key.

In [11]:
import phoenixAES

with open('r9faults', 'w') as f:
    f.write(baseline.hex() + '\n')
    for fault in set(dfa_faults_r9):
        f.write(fault + '\n')
    
dfa_res = phoenixAES.crack_file("r9faults")
print(dfa_res)

F64E80..D1F9..A523..4C24..0297FD
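
phoenixAES reports the last round key (round 10 for AES-128), not the cipher key. Once all 16 bytes are known, the cipher key follows by running the key schedule backwards. The sketch below assumes AES-128 and generates the S-box instead of hard-coding it. All function names are illustrative and not part of phoenixAES.

```python
def _gf_mul(a, b):
    """Multiply two bytes in GF(2^8) modulo the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def _build_sbox():
    """Generate the AES S-box: field inverse followed by the affine map."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if _gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox


SBOX = _build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x1B, 0x36]


def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def previous_round_key(round_key, round_number):
    """Step the AES-128 key schedule back from round `round_number` by one round."""
    w = [round_key[i:i + 4] for i in range(0, 16, 4)]
    p3 = _xor(w[3], w[2])
    p2 = _xor(w[2], w[1])
    p1 = _xor(w[1], w[0])
    rotated = p3[1:] + p3[:1]                   # RotWord
    g = bytearray(SBOX[b] for b in rotated)     # SubWord
    g[0] ^= RCON[round_number - 1]              # round constant
    p0 = _xor(w[0], g)
    return p0 + p1 + p2 + p3


def cipher_key_from_last_round_key(last_round_key):
    """Walk from the round-10 key back to the AES-128 cipher key."""
    key = bytes(last_round_key)
    for round_number in range(10, 0, -1):
        key = previous_round_key(key, round_number)
    return key


# Known-answer check using the round-10 key of the FIPS-197 Appendix A.1 example
k10 = bytes.fromhex('d014f9a8c9ee2589e13f0cc8b6630ca6')
print(cipher_key_from_last_round_key(k10).hex())
```

The check uses the key expansion example from FIPS-197 Appendix A.1, whose cipher key is `2b7e151628aed2a6abf7158809cf4f3c`.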


## What if the attack fails?

There are many ways in which this notebook can be improved, and in some cases this notebook may fail to recover the full key from the faulty ciphertexts. There are a few things you can do to recover the remaining key bytes:

* Try to get more distinct faulty ciphertexts and try again
* Try to attack round 8
* Try to use the [scripts by Yifan Lu](https://github.com/SideChannelMarvels/JeanGrey/tree/master/phoenixAES-yifan) to try all combinations of faulty outputs.
* Or if you are missing 4 bytes simply use [SideChannelMarvels - Hulk](https://github.com/SideChannelMarvels/Hulk) to brute force the remaining bytes.

### Example using Hulk
The output saved in this notebook shows an example of an incomplete key recovery.

`F64E80..D1F9..A523..4C24..0297FD`

Using Hulk we can easily brute force the remaining bytes.

`./hulk d 7117d95886a44a8bcf16570335cad7d6 aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa f64e80??d1f9??a523??4c24??0297fd 10`

The output from Hulk is shown below. Using 8 AES-NI units, Hulk took about 34 s to determine the remaining 4 key bytes. Note that the key Hulk reports (`T05 Key found`) is the first round key, which for AES-128 is the cipher key itself, not the last round key.

```
Hulk v0.1 (pgarba 2018)
[*] AES-NI is supported by this CPU!
[*] Round           : 10
[*] Mode            : Decryption
[*] Key             : F64E80??D1F9??A523??4C24??0297FD
[*] Input           : 7117d95886a44a8bcf16570335cad7d6
[*] Expected        : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[!] Bruteforce      : 4 missing bytes
[*] Byte 0 Index 3
[*] Byte 1 Index 6
[*] Byte 2 Index 9
[*] Byte 3 Index 12
[*] AES-NI Units    : 8
[*] Range           : 00000000 - FFFFFFFF
[*] Step            : 1FFFFFFF
[*] T00 Range       : 00000000 - 1FFFFFFF
[*] T01 Range       : 20000000 - 3FFFFFFF
[*] T02 Range       : 40000000 - 5FFFFFFF
[*] T03 Range       : 60000000 - 7FFFFFFF
[*] T04 Range       : 80000000 - 9FFFFFFF
[*] T05 Range       : A0000000 - BFFFFFFF
[*] T06 Range       : C0000000 - DFFFFFFF
[*] T07 Range       : E0000000 - FFFFFFFF
[!] T05 Key found   : 0123456789abcdef123456789abcdef0
[*] Output          : aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
[!] Valid key!
```
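
The partial key printed by phoenixAES can be converted into the masked key argument used in the Hulk command above with a few lines of Python. This sketch assumes unknown bytes are always marked as `..`, as in the output saved in this notebook; the helper name `to_hulk_mask` is illustrative.

```python
def to_hulk_mask(phoenix_key):
    """Convert a phoenixAES result such as 'F64E80..D1F9..' to '??' notation.

    Returns the lowercase mask and the indices of the unknown key bytes.
    """
    pairs = [phoenix_key[i:i + 2] for i in range(0, len(phoenix_key), 2)]
    missing = [i for i, pair in enumerate(pairs) if pair == '..']
    mask = ''.join('??' if pair == '..' else pair.lower() for pair in pairs)
    return mask, missing


mask, missing = to_hulk_mask('F64E80..D1F9..A523..4C24..0297FD')
print(mask)
print(missing)
```

The list of missing indices should match the `Byte ... Index ...` lines that Hulk prints, and its length tells you how large the brute-force search space is (256 to the power of the number of missing bytes).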