# Tutorial A5 (Breaking AES-256 Bootloader)

This tutorial will take you through a complete attack on an encrypted bootloader using AES-256. This demonstrates how to use side-channel power analysis on practical systems, along with discussing how to perform analysis with different Analyzer models.

## Background

In the world of microcontrollers, a bootloader is a special piece of firmware that is made to let the user upload new programs into memory. This is especially useful for devices with complex code that may need to be patched or otherwise updated in the future - a bootloader makes it possible for the user to upload a patched version of the firmware onto the micro. The bootloader receives information from a communication line (a USB port, serial port, ethernet port, WiFi connection, etc...) and stores this data into program memory. Once the full firmware has been received, the micro can happily run its updated code.

There is one big security issue to worry about with bootloaders. A company may want to stop their customers from writing their own firmware and uploading it onto the micro. For example, this might be for protection reasons - hackers might be able to access parts of the device that weren't meant to be accessed. One way of stopping this is to add encryption. The company can add their own secret signature to the firmware code and encrypt it with a secret key. Then, the bootloader can decrypt the incoming firmware and confirm that the incoming firmware is correctly signed. Users will not know the secret key or the signature tied to the firmware, so they won't be able to "fake" their own.

This tutorial will work with a simple AES-256 bootloader. The victim will receive data through a serial connection, decrypt the command, and confirm that the included signature is correct. Then, it will only save the code into memory if the signature check succeeded. To make this system more robust against attacks, the bootloader will use cipher-block chaining (CBC mode). Our goal is to find the secret key and the CBC initialization vector so that we could successfully fake our own firmware.

### Bootloader Communications Protocol

The bootloader's communications protocol operates over a serial port at 38400 baud. In this example the bootloader is always waiting for new data; in real life one would typically have to force the device into the bootloader with a command sequence.

Commands sent to the bootloader look as follows:

```
       |<-------- Encrypted block (16 bytes) ---------->|
       |                                                |
+------+------+------+------+------+------+ .... +------+------+------+
| 0x00 |    Signature (4 Bytes)    |  Data (12 Bytes)   |   CRC-16    |
+------+------+------+------+------+------+ .... +------+------+------+
```

This frame has four parts:

* `0x00`: 1 byte of fixed header
* Signature: A secret 4 byte constant. The bootloader will confirm that this signature is correct after decrypting the frame.
* Data: 12 bytes of the incoming firmware. This system forces us to send the code 12 bytes at a time; more complete bootloaders may allow longer variable-length frames.
* CRC-16: A 16-bit checksum using the CRC-CCITT polynomial (0x1021). The capture code in this tutorial sends the MSB of the CRC first, followed by the LSB. The bootloader will reply over the serial port, describing whether or not this CRC check was valid.

As described in the diagram, the 16 byte block is not sent as plaintext. Instead, it is encrypted using AES-256 in CBC mode. This encryption method will be described in the next section.
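To make the frame layout concrete, here is a minimal sketch of how a frame could be assembled in Python. The helper names `crc16_ccitt` and `build_frame` are made up for this illustration and are not part of ChipWhisperer. The CRC uses polynomial 0x1021 with an initial value of 0, and the byte order of the CRC is an assumption that mirrors the capture code later in this tutorial.

```python
def crc16_ccitt(data, crc=0x0000):
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, initial value 0."""
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def build_frame(block):
    """Wrap a 16 byte (already encrypted) block in the bootloader frame."""
    if len(block) != 16:
        raise ValueError("the encrypted block must be exactly 16 bytes")
    crc = crc16_ccitt(block)
    # Assumption: CRC byte order mirrors the capture code in this tutorial.
    return [0x00] + list(block) + [crc >> 8, crc & 0xFF]


# Example: a dummy block of 16 bytes gives a 19 byte frame.
frame = build_frame(bytes(range(16)))
print(len(frame), ["{:02X}".format(b) for b in frame[-2:]])
```

The CRC is calculated over the 16 byte block only, so the fixed `0x00` header is not covered by the checksum.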

The bootloader responds to each command with a single byte indicating if the CRC-16 was OK or not:

```
            +------+
CRC-OK:     | 0xA4 |
            +------+

            +------+
CRC Failed: | 0xA1 |
            +------+
```
Then, after replying to the command, the bootloader verifies that the signature is correct. If it matches the expected manufacturer's signature, the 12 bytes of data will be written to flash memory. Otherwise, the data is discarded.

### Details of AES-256 CBC

The system uses the AES algorithm in Cipher Block Chaining (CBC) mode. In general one avoids using encryption 'as-is' (i.e. Electronic Code Book mode), since it means any piece of plaintext always maps to the same piece of ciphertext. Cipher Block Chaining mixes each block with the one before it, so the same plaintext block does not keep producing the same ciphertext block.

You can see another reference on the design of the encryption side; we'll be only talking about the decryption side here. In this case AES-256 CBC mode is used as follows, where the details of the AES-256 Decryption block will be discussed in detail later:

![AES-256](https://wiki.newae.com/images/8/88/Aes256_cbc.png)

This diagram shows that the output of the decryption is no longer used directly as the plaintext. Instead, the output is XORed with a 16 byte mask, which is usually taken from the previous ciphertext. Also, the first decryption block has no previous ciphertext to use, so a secret initialization vector (IV) is used instead. If we are going to decrypt the entire ciphertext (including block 0) or correctly generate our own ciphertext, we'll need to find this IV along with the AES key.
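The chaining described above can be sketched in a few lines of Python. To keep the example self-contained, the block cipher here is a deliberately trivial stand-in (`toy_encrypt_block` and `toy_decrypt_block` are invented for this illustration and are not AES); only the XOR-and-chain structure is the point.

```python
def toy_encrypt_block(block, key):
    # Stand-in for AES-256 encryption: NOT secure, only invertible.
    return bytes(((b ^ k) + 1) & 0xFF for b, k in zip(block, key))


def toy_decrypt_block(block, key):
    # Inverse of toy_encrypt_block.
    return bytes(((b - 1) & 0xFF) ^ k for b, k in zip(block, key))


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(blocks, key, iv):
    prev, out = iv, []
    for pt in blocks:
        ct = toy_encrypt_block(xor_bytes(pt, prev), key)
        out.append(ct)
        prev = ct  # the next block is chained to this ciphertext
    return out


def cbc_decrypt(blocks, key, iv):
    prev, out = iv, []
    for ct in blocks:
        # Decrypt first, then XOR with the IV (block 0) or previous ciphertext.
        out.append(xor_bytes(toy_decrypt_block(ct, key), prev))
        prev = ct
    return out


key = bytes(range(16))
iv = bytes([0xAA] * 16)
plaintext = [b"A" * 16, b"A" * 16]

ciphertext = cbc_encrypt(plaintext, key, iv)
recovered = cbc_decrypt(ciphertext, key, iv)
wrong_iv = cbc_decrypt(ciphertext, key, bytes(16))
```

With these toy values, `recovered` matches `plaintext`, the two identical plaintext blocks produce different ciphertext blocks, and decrypting with the wrong IV only corrupts block 0. This is why the key alone is enough to read every block except the first.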

### Attacking AES-256

The system in this tutorial uses AES-256 encryption, which has a 256 bit (32 byte) key - twice as large as the 16 byte key we've attacked in previous tutorials. This means that our regular AES-128 CPA attacks won't quite work. However, extending these attacks to AES-256 is fairly straightforward: the theory is explained in detail in Extending AES-128 Attacks to AES-256.

As the theory page explains, our AES-256 attack will have 4 steps:

1. Perform a standard attack (as in AES-128 decryption) to determine the first 16 bytes of the key, corresponding to the 14th round encryption key.
1. Using the known 14th round key, calculate the hypothetical outputs of each S-Box from the 13th round using the ciphertext processed by the 14th round, and determine the 16 bytes of the 13th round key manipulated by inverse MixColumns.
1. Perform the MixColumns and ShiftRows operation on the hypothetical key determined above, recovering the 13th round key.
1. Using the AES-256 key schedule, reverse the 13th and 14th round keys to determine the original AES-256 encryption key.

## Firmware

For this tutorial, we'll be using the `bootloader-aes256` project, which we'll build as usual:

In [393]:
PLATFORM = "CWLITEARM"
CRYPTO_TARGET="NONE"

In [461]:
%%bash -s "$PLATFORM" "$CRYPTO_TARGET"
cd ../../hardware/victims/firmware/bootloader-aes256
make PLATFORM=$1 CRYPTO_TARGET=$2

rm -f -- bootloader-aes256-CWLITEARM.hex
rm -f -- bootloader-aes256-CWLITEARM.eep
rm -f -- bootloader-aes256-CWLITEARM.cof
rm -f -- bootloader-aes256-CWLITEARM.elf
rm -f -- bootloader-aes256-CWLITEARM.map
rm -f -- bootloader-aes256-CWLITEARM.sym
rm -f -- bootloader-aes256-CWLITEARM.lss
rm -f -- objdir/*.o
rm -f -- objdir/*.lst
rm -f -- bootloader.s aes256.s crcccitt.s simpleserial.s stm32f3_hal.s stm32f3_hal_lowlevel.s stm32f3_sysmem.s
rm -f -- bootloader.d aes256.d crcccitt.d simpleserial.d stm32f3_hal.d stm32f3_hal_lowlevel.d stm32f3_sysmem.d
rm -f -- bootloader.i aes256.i crcccitt.i simpleserial.i stm32f3_hal.i stm32f3_hal_lowlevel.i stm32f3_sysmem.i
.
-------- begin --------
arm-none-eabi-gcc (15:6.3.1+svn253039-1build1) 6.3.1 20170620
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

.
Compiling C: bootloader.c
arm-none-eabi-gcc -c

## Capturing Traces

### Setup

To start, we'll proceed with setup as usual:

In [153]:
%run "Helper Scripts/CWLite_Connect.ipynb"

In [154]:
%run "Helper Scripts/Setup_Target_Generic.ipynb"

In [387]:
# uncomment based on your target
fw_path = "../../hardware/victims/firmware/bootloader-aes256/bootloader-aes256-CWLITEARM.hex"
#%run "Helper Scripts/Program_XMEGA.ipynb"
%run "Helper Scripts/Program_STM.ipynb"
#%run "Helper Scripts/No_Programmer.ipynb"

In [462]:
program_target(scope, fw_path)

Detected known STMF32: STM32F302xB(C)/303xB(C)
Extended erase (0x44), this can take ten seconds or more
Attempting to programming 5919 bytes at 0x8000000
STM32F Programming flash...
STM32F Reading flash...
Verified flash OK, 5919 bytes


### Calculating the CRC

The next step we'll need to take in attacking this target is to communicate with it. Most of the transmission is fairly straightforward, but the CRC is a little tricky. Luckily, there's a lot of open source code out there for calculating CRCs. In this case, we'll pull some code from pycrc:

In [157]:
# Class Crc
#############################################################
# These CRC routines are copy-pasted from pycrc, which are:
# Copyright (c) 2006-2013 Thomas Pircher <tehpeh@gmx.net>
#
class Crc(object):
    """
    A base class for CRC routines.
    """

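    def __init__(self, width, poly):
        """The Crc constructor.

        The parameters are as follows:
            width   width of the CRC register in bits
            poly    generator polynomial, without the leading 1 bit

        The reflect_in, xor_in, reflect_out and xor_out options of pycrc
        are not exposed here; XorIn and XorOut are fixed to 0x0000 below.
        """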
        self.Width = width
        self.Poly = poly


        self.MSB_Mask = 0x1 << (self.Width - 1)
        self.Mask = ((self.MSB_Mask - 1) << 1) | 1

        self.XorIn = 0x0000
        self.XorOut = 0x0000

        self.DirectInit = self.XorIn
        self.NonDirectInit = self.__get_nondirect_init(self.XorIn)
        if self.Width < 8:
            self.CrcShift = 8 - self.Width
        else:
            self.CrcShift = 0

    def __get_nondirect_init(self, init):
        """
        return the non-direct init if the direct algorithm has been selected.
        """
        crc = init
        for i in range(self.Width):
            bit = crc & 0x01
            if bit:
                crc ^= self.Poly
            crc >>= 1
            if bit:
                crc |= self.MSB_Mask
        return crc & self.Mask


    def bit_by_bit(self, in_data):
        """
        Classic simple and slow CRC implementation.  This function iterates bit
        by bit over the augmented input message and returns the calculated CRC
        value at the end.
        """
        # If the input data is a string, convert to bytes.
        if isinstance(in_data, str):
            in_data = [ord(c) for c in in_data]

        register = self.NonDirectInit
        for octet in in_data:
            for i in range(8):
                topbit = register & self.MSB_Mask
                register = ((register << 1) & self.Mask) | ((octet >> (7 - i)) & 0x01)
                if topbit:
                    register ^= self.Poly

        for i in range(self.Width):
            topbit = register & self.MSB_Mask
            register = ((register << 1) & self.Mask)
            if topbit:
                register ^= self.Poly

        return register ^ self.XorOut
    
bl_crc = Crc(width = 16, poly=0x1021)

Now we can easily get the CRC for our message by calling `bl_crc.bit_by_bit(message)`. 

### Communicating with the Bootloader

With that done, we can start communicating with the bootloader. Recall that the bootloader expects:
* To start with `0x00`
* A 16 byte encrypted message (4 bytes signature + 12 bytes data)
* CRC16

We don't really care what the 16 byte message is (just that each one is different, so that we get a variety of Hamming weights), so we'll use the same text/key module from earlier attacks.

We can now run the following block, and we should get `0xA4` back. You may need to run this block a few times to get the right response back.

In [170]:
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import time
message = [0x00]
ktp = AcqKeyTextPattern_Basic(target=target)

# clear serial buffer
num_char = target.ser.inWaiting()
print(target.ser.read(num_char))

key, text = ktp.newPair() #don't care about key here
message.extend(text)

crc = bl_crc.bit_by_bit(text)

message.append(crc >> 8)
message.append(crc & 0xFF)

target.ser.write(message)
time.sleep(0.1)

num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
print("Response: {:02X}".format(ord(response[0])))

¡¡
Response: A4


### Capturing Traces

With that out of the way, we can proceed to capturing our traces. The usual 5000 samples per trace isn't long enough to cover the rounds we care about, so we'll need to increase it (11000 should be fine):

In [102]:
scope.adc.samples = 11000

We'll be working with Analyzer, so we'll need to use a ChipWhisperer project to store our traces and text:

In [23]:
from chipwhisperer.common.api.ProjectFormat import ProjectFormat
project = cw.createProject("projects/Tutorial A5", overwrite=True)
tc = project.getTraceFormat()
ktp = AcqKeyTextPattern_Basic(target=target)

Below you'll find our capture loop. This will be pretty similar to Tutorial B5, but we've added our communication code. We also check the response and just skip the data we get if it isn't correct.

In [24]:
#Capture Traces
from tqdm import tqdm
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import numpy as np
import time
keys = []
N = 100  # Number of traces
target.init()
for i in tqdm(range(N), desc='Capturing traces'):
    message = [0x00]
    
    num_char = target.ser.inWaiting()
    target.ser.read(num_char)
    
    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
    keys.append(key)
    
    message.extend(text)
    
    crc = bl_crc.bit_by_bit(text)
    message.append(crc >> 8)
    message.append(crc & 0xFF)

    # run aux stuff that should run before the scope arms here

    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.ser.write(message)
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if not response:
        # No reply at all, just skip
        print("No response from target")
        continue
    if ord(response[0]) != 0xA4:
        # Bad response, just skip
        print("Bad response: {:02X}".format(ord(response[0])))
        continue
    
    tc.addTrace(scope.getLastTrace(), text, "", key)
    
tc._isloaded = True
project.traceManager().appendSegment(tc)

Capturing traces: 100%|██████████| 100/100 [00:12<00:00,  7.71it/s]


With that, we're done with capturing traces! We can now disconnect from the hardware:

In [None]:
scope.dis()
target.dis()

## Analysis

Now that we have our traces, we can go ahead and perform the attack. As described in the background theory, we'll have to do two attacks - one to get the 14th round key, and another (using the first result) to get the 13th round key. Then, we'll do some post-processing to finally get the 256 bit encryption key.

### 14th Round Key

We can attack the 14th round key with a standard, no-frills CPA attack (using the inverse sbox, since it's a decryption that we're breaking):

In [25]:
import chipwhisperer as cw
from chipwhisperer.analyzer.attacks.cpa import CPA
from chipwhisperer.analyzer.attacks.cpa_algorithms.progressive import CPAProgressive
from chipwhisperer.analyzer.attacks.models.AES128_8bit import AES128_8bit, InvSBox_output

tm = project.traceManager()

attack = CPA()
leak_model = AES128_8bit(InvSBox_output)
attack.setAnalysisAlgorithm(CPAProgressive, leak_model)
attack.setTraceSource(tm)
attack.setTraceStart(0)
attack.setTracesPerAttack(tm.numTraces())
attack.setIterations(1)
attack.setReportingInterval(10)
attack.setTargetSubkeys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
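To make the leakage model concrete, here is a standalone sketch of what a CPA attack on one subkey boils down to, run on simulated data rather than real traces. It assumes the measured power at one sample point is the Hamming weight of the inverse S-box output plus Gaussian noise, which is a simplification. The inverse S-box is derived from its algebraic definition to keep the example short, and none of the helper names below come from Analyzer.

```python
import random


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def build_inv_sbox():
    """Derive the AES inverse S-box from its algebraic definition."""
    inv_sbox = [0] * 256
    for x in range(256):
        # Multiplicative inverse (0 maps to 0), found by brute force.
        inv = next((y for y in range(1, 256) if gf_mul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_sbox[s ^ 0x63] = x
    return inv_sbox


def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


INV_SBOX = build_inv_sbox()
HW = [bin(v).count("1") for v in range(256)]

rng = random.Random(0)
true_subkey = 0xEA  # example value only
inputs = [v for _ in range(2) for v in range(256)]  # simulated ciphertext bytes
# Simulated single-sample "traces": Hamming weight leakage plus noise.
traces = [HW[INV_SBOX[c ^ true_subkey]] + rng.gauss(0, 1.0) for c in inputs]

# Correlate the hypothetical leakage of every guess with the measurements.
scores = [abs(pearson([HW[INV_SBOX[c ^ g]] for c in inputs], traces))
          for g in range(256)]
best_guess = max(range(256), key=lambda g: scores[g])
print("Best guess: {:02X}".format(best_guess))
```

Analyzer does the same thing for every sample point of every trace and for all 16 subkeys, which is why narrowing the point range speeds up the attack so much.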

With the setup done, we can actually perform the attack. 11000 samples is a rather large amount to chew through, so if you want a faster attack you can use a smaller range in `attack.setPointRange()`. `(2900, 4200)` will work for XMEGA, while `(1400, 2600)` will work for the STM32F3 (CWLite ARM).

Below you'll find the key that we should recover from this attack. You may want to check what we actually get against this key to make sure the attack is working.

In [29]:
key = [0xea, 0x79, 0x79, 0x20, 0xc8, 0x71, 0x44, 0x7d, 0x46, 0x62, 0x5f, 0x51, 0x85, 0xc1, 0x3b, 0xcb]
#key = keys[0]

In [30]:
import pandas as pd
def format_stat(stat):
    return str("{:02X}<br>{:.3f}".format(stat[0], stat[2]))

def color_corr_key(row):
    global key
    ret = [""] * 16
    for i,bnum in enumerate(row):
        if bnum[0] == key[i]:
            ret[i] = "color: red"
        else:
            ret[i] = ""
    return ret

from IPython.display import clear_output
import numpy as np
        
def stats_callback():
    attack_results = attack.getStatistics()
    attack_results.setKnownkey(key)
    stat_data = attack_results.findMaximums()
    df = pd.DataFrame(stat_data).transpose()
    clear_output(wait=True)
    display(df.head().style.format(format_stat).apply(color_corr_key,axis=1))
    
attack.setPointRange((0, -1))
attack_results = attack.processTracesNoGUI(stats_callback)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
0,EA 0.880,79 0.831,79 0.883,20 0.894,C8 0.869,71 0.883,44 0.861,7D 0.870,46 0.905,62 0.878,5F 0.860,51 0.872,85 0.871,C1 0.838,3B 0.867,CB 0.879
1,3F 0.521,15 0.487,BA 0.488,D0 0.486,5F 0.505,74 0.497,CF 0.513,E9 0.482,9B 0.507,3B 0.478,48 0.482,85 0.475,CA 0.481,A1 0.513,1A 0.512,EC 0.477
2,35 0.487,44 0.473,C5 0.471,0C 0.481,15 0.471,A2 0.476,9C 0.493,3C 0.467,54 0.503,12 0.473,26 0.462,AF 0.465,96 0.476,AB 0.487,0B 0.508,E4 0.460
3,84 0.483,B1 0.463,06 0.471,9C 0.478,8C 0.468,53 0.469,58 0.482,CD 0.461,FA 0.496,EC 0.467,DC 0.460,5B 0.455,55 0.471,83 0.484,99 0.464,F3 0.442
4,DB 0.472,2D 0.460,DB 0.460,4C 0.477,6B 0.466,5A 0.468,A9 0.476,4A 0.461,86 0.484,4E 0.459,16 0.456,C7 0.454,35 0.458,E0 0.476,74 0.461,91 0.441


### 13th Round Key

Analyzer doesn't have a leakage model for the 13th round key built in, so we'll need to create our own. An example class is shown below along with the beginning of the setup. **NOTE: You'll need to update `calc_round_key` with the key you found in the last step**

In [31]:
import chipwhisperer as cw
from chipwhisperer.analyzer.attacks.cpa import CPA
from chipwhisperer.analyzer.attacks.cpa_algorithms.progressive import CPAProgressive
from chipwhisperer.analyzer.attacks.models.AES128_8bit import AES128_8bit, AESLeakageHelper
from chipwhisperer.analyzer.preprocessing.resync_sad import ResyncSAD

class AES256_Round13_Model(AESLeakageHelper):
    def leakage(self, pt, ct, guess, bnum):
        #You must put YOUR recovered 14th round key here - this example may not be accurate!
        calc_round_key = [0xea, 0x79, 0x79, 0x20, 0xc8, 0x71, 0x44, 0x7d, 0x46, 0x62, 0x5f, 0x51, 0x85, 0xc1, 0x3b, 0xcb]
        xored = [calc_round_key[i] ^ pt[i] for i in range(0, 16)]
        block = xored
        block = self.inv_shiftrows(block)
        block = self.inv_subbytes(block)
        block = self.inv_mixcolumns(block)
        block = self.inv_shiftrows(block)
        result = block
        return self.inv_sbox((result[bnum] ^ guess[bnum]))
    
attack = CPA()
leak_model = AES128_8bit(AES256_Round13_Model)
attack.setAnalysisAlgorithm(CPAProgressive, leak_model)
attack.setTraceSource(project.traceManager())

#### Resyncing Traces (XMEGA Only)

The traces for the XMEGA version of the firmware become desynced around sample 7000. This is due to a non-constant-time AES implementation: the code does not always take the same amount of time to run for every input. (It's actually possible to do a timing attack on this AES implementation! We'll stick with our CPA attack for now.)

While this does open up a timing attack, it actually makes our AES attack a little harder, since we'll have to resync the traces. Luckily, this can be done pretty easily by using the ResyncSAD preprocessing module:

In [None]:
resync_traces = ResyncSAD(tm)
resync_traces.enabled = True
resync_traces.ref_trace = 0
resync_traces.target_window = (9100, 9300)
resync_traces.max_shift = 200
attack.setTraceSource(resync_traces)
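For intuition, the idea behind a sum-of-absolute-differences (SAD) resync can be sketched in plain Python. This is only an illustration of the technique under simple assumptions (integer shifts, zero padding); it is not the code used by the `ResyncSAD` module, and the function names are invented here.

```python
def sad(a, b):
    """Sum of absolute differences between two equal-length windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def find_shift(ref, trace, window, max_shift):
    """Find the shift that best matches trace to ref inside the window."""
    start, end = window
    ref_win = ref[start:end]
    best_score, best_shift = None, 0
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = start + shift, end + shift
        if lo < 0 or hi > len(trace):
            continue  # window would fall off the trace
        score = sad(ref_win, trace[lo:hi])
        if best_score is None or score < best_score:
            best_score, best_shift = score, shift
    return best_shift


def resync(ref, trace, window, max_shift):
    """Shift trace so that its window lines up with the reference."""
    shift = find_shift(ref, trace, window, max_shift)
    n = len(trace)
    return [trace[i + shift] if 0 <= i + shift < n else 0 for i in range(n)]


# A reference with one distinctive feature, and a copy delayed by 7 samples.
ref = [0] * 100
ref[50:55] = [1, 3, 5, 3, 1]
late = [0] * 7 + ref[:-7]

shift = find_shift(ref, late, (45, 60), 10)
aligned = resync(ref, late, (45, 60), 10)
```

The window should surround a feature that appears in every trace, and the maximum shift should be at least as large as the worst desync you expect, which is the role `target_window` and `max_shift` play in the cell above.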

#### Running the Attack

Like in the 14th round attack, we can use a smaller range of points to make the attack faster. `(8000,10990)` works well for the XMEGA, while `(6500, 8500)` works well for the STM32F3.

You can run the block below and the correct key should be printed out:

In [32]:
attack.setTraceStart(0)
attack.setTracesPerAttack(tm.numTraces())
attack.setIterations(1)
attack.setReportingInterval(10)
attack.setTargetSubkeys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
attack.setPointRange((6500,8500))
attack_results = attack.processTracesNoGUI(stats_callback)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
0,C6 0.894,BD 0.902,4E 0.891,50 0.894,AB 0.879,CA 0.876,75 0.894,77 0.887,79 0.887,87 0.889,96 0.845,CA 0.883,1C 0.903,7F 0.866,C5 0.902,82 0.916
1,64 0.495,51 0.462,AB 0.461,B2 0.455,E1 0.439,C7 0.481,5A 0.449,5D 0.474,1B 0.441,F2 0.450,EC 0.461,A4 0.457,29 0.467,39 0.453,ED 0.482,74 0.465
2,59 0.450,0E 0.455,04 0.448,5F 0.451,00 0.436,AF 0.455,DF 0.441,A6 0.471,B8 0.437,DF 0.445,AF 0.455,97 0.444,67 0.444,21 0.429,8F 0.481,19 0.453
3,AB 0.445,8E 0.434,BB 0.445,54 0.445,78 0.435,F7 0.444,0C 0.431,51 0.461,03 0.437,F9 0.438,EE 0.448,85 0.428,6A 0.441,7B 0.421,14 0.447,E6 0.449
4,E9 0.437,16 0.432,E1 0.444,46 0.445,AC 0.429,54 0.434,95 0.430,FD 0.432,71 0.430,74 0.435,F3 0.446,75 0.417,A5 0.436,2A 0.416,B7 0.438,FF 0.439


This, however, isn't actually the 13th round key. To get the real 13th round key, we'll need to run what we've recovered through a `shiftrows()` and `mixcolumns()` operation:

In [35]:
rec_key2 = []
for bnum in attack_results.findMaximums():
    print("Best Guess = 0x{:02X}, Corr = {}".format(bnum[0][0], bnum[0][2]))
    rec_key2.append(bnum[0][0])

from chipwhisperer.analyzer.attacks.models.aes.funcs import shiftrows,mixcolumns
    
real_key2 = shiftrows(rec_key2)
real_key2 = mixcolumns(real_key2)

print("Recovered:", end="")
for subkey in real_key2:
    print(" {:02X}".format(subkey), end="")
print("")

Best Guess = 0xC6, Corr = 0.8937448929000054
Best Guess = 0xBD, Corr = 0.9019453664779397
Best Guess = 0x4E, Corr = 0.891451634170113
Best Guess = 0x50, Corr = 0.8943159714733331
Best Guess = 0xAB, Corr = 0.879475780636866
Best Guess = 0xCA, Corr = 0.8764486807809474
Best Guess = 0x75, Corr = 0.894498765030352
Best Guess = 0x77, Corr = 0.8874011576131537
Best Guess = 0x79, Corr = 0.8872766112055775
Best Guess = 0x87, Corr = 0.8890355947927411
Best Guess = 0x96, Corr = 0.8453476922780022
Best Guess = 0xCA, Corr = 0.8834336711981569
Best Guess = 0x1C, Corr = 0.903127475052072
Best Guess = 0x7F, Corr = 0.8662984398584177
Best Guess = 0xC5, Corr = 0.9016144489417923
Best Guess = 0x82, Corr = 0.9164989946928866
Recovered: C6 6A A6 12 4A BA 4D 04 4A 22 03 54 5B 28 0E 63


We now have everything we need to recover the full key! We'll start by combining the 13th and 14th round keys:

In [36]:
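# rec_key is the 14th round key recovered in the first attack. It is assumed
# here to match the known-good `key` list defined earlier; substitute your own.
rec_key = list(key)
rec_key_comb = real_key2.copy()
rec_key_comb.extend(rec_key)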

print("Key:", end="")
for subkey in rec_key_comb:
    print(" {:02X}".format(subkey), end="")
print("")

Key: C6 6A A6 12 4A BA 4D 04 4A 22 03 54 5B 28 0E 63 EA 79 79 20 C8 71 44 7D 46 62 5F 51 85 C1 3B CB


Then we can use the `AES128_8bit` leakage model to run the key schedule backwards and recover the first two round keys, which together make up the original 256 bit key:

In [37]:
result = leak_model.keyScheduleRounds(rec_key_comb, 13, 0)
result.extend(leak_model.keyScheduleRounds(rec_key_comb, 13, 1))
print("Key:", end="")
for subkey in result:
    print(" {:02X}".format(subkey), end="")
print("")

Key: 94 28 5D 4D 6D CF EC 08 D8 AC DD F6 BE 25 A4 99 C4 D9 D0 1E C3 40 7E D7 D5 28 D4 09 E9 F0 88 A1


You should see a 32 byte key printed out. Open `supersecret.h`, confirm that we have the right key, and celebrate! 

## Recovering the IV

Now that we have the encryption key, we can move on to attacking the next secret value: the IV.

Here, we have the luxury of seeing the source code of the bootloader. This is generally not something we would have access to in the real world, so we'll try not to use it to cheat. (Peeking at `supersecret.h` counts as cheating.) Instead, we'll use the source to help us identify important parts of the power traces.

### Bootloader Source Code

Inside the bootloader's main loop, it does three tasks that we're interested in:

* it decrypts the incoming ciphertext;
* it applies the IV to the decryption's result; and
* it checks for the signature in the resulting plaintext.

This snippet from `bootloader.c` shows all three of the tasks:

```C
// Continue with decryption
trigger_high();                
aes256_decrypt_ecb(&ctx, tmp32);
trigger_low();
             
// Apply IV (first 16 bytes)
for (i = 0; i < 16; i++){
    tmp32[i] ^= iv[i];
}

//Save IV for next time from original ciphertext                
for (i = 0; i < 16; i++){
    iv[i] = tmp32[i+16];
}

// Tell the user that the CRC check was okay
putch(COMM_OK);
putch(COMM_OK);

//Check the signature
if ((tmp32[0] == SIGNATURE1) &&
   (tmp32[1] == SIGNATURE2) &&
   (tmp32[2] == SIGNATURE3) &&
   (tmp32[3] == SIGNATURE4)){
   
   // Delay to emulate a write to flash memory
   _delay_ms(1);
}   
```

This gives us a pretty good idea of how the microcontroller is going to do its job, but if you'd like to go further, you can open the `.lss` file for the binary that was built. This is called a listing file and it lets you see the assembly that the C was compiled and linked to.

### Power Traces

As you can see from both files, after the decryption process, the bootloader executes a few distinct pieces of code:

* To apply the IV, it uses an XOR operation;
* To store the new IV, it copies the previous ciphertext into the IV array;
* It sends two bytes on the serial port;
* It checks the bytes of the signature one by one.

We should be able to recognize these four parts of the code in the power traces. Let's modify our capture routine to find them:

1. We're looking for the original IV, but it's overwritten after each successful decryption. This means we'll have to reset the target before each trace we capture.
1. We'd like to skip over all of the decryption process. Recall that the trigger pin is set low after the decryption finishes. This means we can skip over the AES-256 function by triggering on a falling edge instead.
1. Depending on the target, we may have to flush the target's serial lines by sending it a bunch of invalid data and looking for a bad CRC return. This slows down the capture process by a lot, so you may want to try without doing this first.
1. We won't need as many samples, so we can reduce how many we capture. 3000 should be sufficient for most targets.

Let's start by reducing our samples and making a function to reset our target (depending on your target, you may need to change the reset pin):

In [109]:
import time
scope.adc.samples = 3000
def reset_target(scope):
    scope.io.nrst = 'low'
    #scope.io.pdic = 'low'
    time.sleep(0.05)
    scope.io.nrst = 'high'
    #scope.io.pdic = 'high'

We can trigger on a falling edge by changing `scope.adc.basic_mode` to `"falling_edge"`:

In [110]:
scope.adc.basic_mode = "falling_edge"

We can flush the serial line by sending an invalid message, then checking for a bad CRC return value (`0xA1`). Let's make sure our changes work by getting a trace:

In [40]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
reset_target(scope)
message = [0x00]


num_char = target.ser.inWaiting()
target.ser.read(num_char)

key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here

message.extend(text)

crc = bl_crc.bit_by_bit(text)
message.append(crc >> 8)
message.append(crc & 0xFF)

#flush target's serial
okay = 0
while not okay:
    target.ser.write("\0xxxxxxxxxxxxxxxxxx")
    time.sleep(0.005)
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if response:
        if ord(response[0]) == 0xA1:
            okay = 1

scope.arm()

target.ser.write(message)
timeout = 50
# wait for target to finish
while target.isDone() is False and timeout:
    timeout -= 1
    time.sleep(0.01)

try:
    ret = scope.capture()
    if ret:
        print('Timeout happened during acquisition')
except IOError as e:
    print('IOError: %s' % str(e))

# run aux stuff that should happen after trace here
num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
if ord(response[0]) != 0xA4:
    # Bad response, just skip
    print("Bad response: {:02X}".format(ord(response[0])))


trace = scope.getLastTrace()


output_notebook()
p = figure()

xrange = range(len(trace))
p.line(xrange, trace, line_color="red")
show(p)

You should see 5 different sections:

* 16 XORs
* 16 register loads (this is the new IV being copied over)
* Some serial communication
* The signature check
* The serial line going idle

Different targets have different power traces (for example, on Arm the XORs and register loads are almost identical), but hopefully you can pick out where each section is. For example, on XMEGA:

![XMEGA_Bonus_Trace](https://wiki.newae.com/images/f/f6/Tutorial-A5-Bonus-Trace-Notes.PNG)

With all of these things clearly visible, we have a pretty good idea of how to attack the IV and the signature. We should be able to look at each of the XOR spikes to find each of the IV bytes - each byte is processed on its own. Then, the signature check uses a short-circuiting comparison: as soon as it finds a byte in error, it stops checking the remaining bytes. This type of check is susceptible to a timing attack.
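To see why a short-circuiting comparison is a problem, consider the following simulation. The signature value, the `check_signature` oracle and its step counter are all hypothetical: the counter stands in for the timing information we would read off the power trace, and on the real target we would also have to encrypt each guess with the recovered key and IV before sending it.

```python
SECRET_SIGNATURE = bytes([0x12, 0x34, 0x56, 0x78])  # hypothetical value


def check_signature(candidate):
    """Short-circuiting compare; returns (ok, number of bytes compared)."""
    steps = 0
    for got, want in zip(candidate, SECRET_SIGNATURE):
        steps += 1
        if got != want:
            return False, steps  # stops at the first wrong byte
    return True, steps


def recover_signature(oracle, length=4):
    """Recover the signature one byte at a time from the comparison time."""
    known = []
    for pos in range(length):
        for guess in range(256):
            candidate = bytes(known + [guess] + [0] * (length - pos - 1))
            ok, steps = oracle(candidate)
            # A correct byte makes the check run longer (or pass entirely).
            if ok or steps > pos + 1:
                known.append(guess)
                break
    return bytes(known)


recovered = recover_signature(check_signature)
print(recovered.hex())
```

Because each byte can be confirmed on its own, the search needs at most 4 x 256 guesses instead of the 256^4 a constant-time comparison would force on us.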

With those things done, we can move onto our capture loop. It's pretty similar to our last one. We're done with Analyzer, so we can store our traces in Python lists (we'll convert to numpy arrays later for easy analysis).

In [405]:
from tqdm import tqdm
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import numpy as np
import time
traces = []
keys = []
plaintexts = []
key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
N = 500  # Number of traces
target.init()
for i in tqdm(range(N), desc='Capturing traces'):
    reset_target(scope)
    message = [0x00]
    

    num_char = target.ser.inWaiting()
    target.ser.read(num_char)
    
    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
    keys.append(key)
    plaintexts.append(text)
    
    message.extend(text)
    
    crc = bl_crc.bit_by_bit(text)
    message.append(crc >> 8)
    message.append(crc & 0xFF)

    # run aux stuff that should run before the scope arms here
    
    #flush target's serial
    okay = 0
    while not okay:
        target.ser.write("\0xxxxxxxxxxxxxxxxxx")
        time.sleep(0.005)
        num_char = target.ser.inWaiting()
        response = target.ser.read(num_char)
        if response:
            if ord(response[0]) == 0xA1:
                okay = 1
    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.ser.write(message)
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
            continue
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if not response:
        # No reply at all, just skip
        print("No response from target")
        continue
    if ord(response[0]) != 0xA4:
        # Bad response, just skip
        print("Bad response: {:02X}".format(ord(response[0])))
        continue
    
    traces.append(scope.getLastTrace())



### Analysis

#### Attack Theory

The bootloader applies the IV to the AES decryption result by calculating


$\text{PT} = \text{DR} \oplus \text{IV}$

where DR is the decrypted ciphertext, IV is the secret vector, and PT is the plaintext that the bootloader will use later. We only have access to one of these: since we know the AES-256 key, we can calculate DR. This exclusive-or should be visible in the power traces.

This is enough information for us to attack a single bit of the IV. Suppose we only wanted to get the first bit (number 0) of the IV. We could do the following:

* Split all of the traces into two groups: those with DR[0] = 0, and those with DR[0] = 1.
* Calculate the average trace for both groups.
* Find the difference between the two averages. It should include a noticeable spike during the first iteration of the loop.
* Look at the direction of the spike to decide if the IV bit is 0 `(PT[0] = DR[0])` or if the IV bit is 1 `(PT[0] = ~DR[0])`.

This is effectively a DPA attack on a single bit of the IV. We can repeat this attack 128 times to recover the entire IV.
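
The steps above can be tried on synthetic data before touching real traces. The sketch below is an illustration only: it models a single trace sample as Gaussian noise plus a small offset whenever the plaintext bit `PT = DR ^ IV` is set. The function name, the leak size, the noise level, and especially the leak polarity (a set bit pulls the reading down) are assumptions; on real hardware the polarity has to be checked against a known result.

```python
import random

def simulate_bit_attack(iv_bit, n_traces=2000, leak=-0.05, noise=0.1, seed=1):
    """Difference-of-means attack on one simulated IV bit.

    Assumed leakage model: sample = leak * PT_bit + Gaussian noise,
    where PT_bit = DR_bit ^ iv_bit.
    """
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for _ in range(n_traces):
        dr_bit = rng.getrandbits(1)       # known to the attacker
        pt_bit = dr_bit ^ iv_bit          # what the target actually handles
        sample = leak * pt_bit + rng.gauss(0.0, noise)
        groups[dr_bit].append(sample)     # split by DR bit, as in the list above

    mean0 = sum(groups[0]) / len(groups[0])
    mean1 = sum(groups[1]) / len(groups[1])
    diff = mean1 - mean0

    # With the assumed negative polarity, a positive difference means the
    # XOR swapped the two groups, i.e. the IV bit is 1.
    guess = 1 if diff > 0 else 0
    return guess, diff

for true_bit in (0, 1):
    guess, diff = simulate_bit_attack(true_bit)
    print("true IV bit:", true_bit, "guess:", guess, "diff:", round(diff, 4))
```

Only the sign of the difference is used, which is why a modest number of traces is enough once the right sample has been located.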

#### A 1-Bit Attack

Recall that we're looking for the XOR between the decrypted block and the IV, so we'll need to decrypt each ciphertext up to that point. PyCrypto includes an AES decryption routine, so we'll be using that. We'll start by importing the necessary modules and converting our traces and plaintexts to numpy arrays:

In [171]:
from Crypto.Cipher import AES
import numpy as np

trace_array = np.asarray(traces)  # if you prefer to work with numpy array for number crunching
textin_array = np.asarray(plaintexts)

numTraces = len(trace_array)
traceLen = len(trace_array[0])

Next we'll do the AES256 decryption. If you got a different key in the earlier part, you'll need to change `knownkey`.

In [172]:
knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
            0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]

knownkey = bytes(knownkey)
dr = []
aes = AES.new(knownkey, AES.MODE_ECB)
for i in range(numTraces):
    ct = bytes(textin_array[i])
    pt = aes.decrypt(ct)
    d = [bytearray(pt)[i] for i in range(16)]
    dr.append(d)

Now, let's split the traces into two groups by comparing bit 0 of the DR:

In [173]:
groupedTraces = [[] for _ in range(2)]
for i in range(numTraces):
    bit0 = dr[i][0] & 0x01
    groupedTraces[bit0].append(trace_array[i])
print(len(groupedTraces[0]))

1


Roughly half of the traces should fit into each group, so if you have 1000 traces, you should expect this to print a number around 500 (or around 250 for the `N = 500` used in the capture loop above). Now, NumPy's average function lets us easily calculate the average at each point:

In [49]:
# Find averages and differences
means = []
for i in range(2):
    means.append(np.average(groupedTraces[i], axis=0))
diff = means[1] - means[0]

Finally, we can plot this difference to see if we can spot the IV:

In [50]:
# Plot the difference between the two group means
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()
p = figure()

xrange = range(len(diff))
xrange2 = range(len(traces[0]))
p.line(xrange, diff, line_color="red")
#p.line(xrange2, traces[0], line_color='blue')
show(p)

You should see a few visible spikes. We're looking for the XOR for byte 0 here, so any later spikes won't be the XOR. Use bokeh's zoom functionality to pinpoint all the largest spikes and record their sample location. You'll probably need to record a few: only one is the correct spike, but we won't be able to tell until we repeat this with other bytes. For example, you might have spikes at 37, 41, and 45. Make sure you record all these values. These peaks won't all be above 0, so make sure you're looking at both positive and negative values.
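
If you'd rather not hunt for the spikes by eye, a short helper can list candidates for you. This is only a sketch: `largest_spikes` is a hypothetical helper, not part of ChipWhisperer or bokeh, and it ranks local maxima of the absolute difference so that negative spikes are treated the same as positive ones.

```python
def largest_spikes(diff, count=3):
    """Return the sample indices of the `count` largest local peaks of |diff|."""
    candidates = []
    for i in range(1, len(diff) - 1):
        magnitude = abs(diff[i])
        # Local maximum of the absolute value (ties resolved to the left sample)
        if magnitude >= abs(diff[i - 1]) and magnitude > abs(diff[i + 1]):
            candidates.append((magnitude, i))
    candidates.sort(reverse=True)
    return sorted(index for _, index in candidates[:count])

# Synthetic difference trace with spikes at the example locations
demo_diff = [0.0] * 60
demo_diff[10] = 0.1    # small bump that should be ignored
demo_diff[37] = -0.8   # negative spike
demo_diff[41] = 0.9
demo_diff[45] = 0.5
print(largest_spikes(demo_diff, count=3))
```

On real data you would pass in the `diff` array computed above and probably ask for more candidates than you expect to need.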

Next, we'll need to repeat this with a few more bytes. To make things easier, the necessary code has been combined into the below block. Increment the `0` in `bit0 = dr[i][0] & 0x01` to other numbers to attack other bytes. Attacking bytes 0 through 3 should be sufficient.

In [51]:
groupedTraces = [[] for _ in range(2)]
for i in range(numTraces):
    bit0 = dr[i][0] & 0x01
    groupedTraces[bit0].append(trace_array[i])
print(len(groupedTraces[0]))

# Find averages and differences
means = []
for i in range(2):
    means.append(np.average(groupedTraces[i], axis=0))
diff = means[1] - means[0]

# Plot the difference between the two group means
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()
p = figure()

xrange = range(len(diff))
xrange2 = range(len(traces[0]))
p.line(xrange, diff, line_color="red")
show(p)

497


Now that you have some peak data, you'll want to use this to find the time shift between XORs. This time shift should be constant from one byte to the next and needs to work for all of the bytes (each run through the loop is the same, so it makes sense that the time shift should be constant). For example, you might have:

```
0th byte @ 37, 41
1st byte @ 77, 81
2nd byte @ 105, 117, 121
3rd byte @ 141, 157, 161
4th byte @ 197, 201
```

With this data, peaks at 41, 81, 121, 161, and 201 have a constant time shift of 40. This means the location of the XORs is `41 + 40 * byte#`.
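
The search for a constant shift can also be done mechanically. The sketch below uses the example numbers from the table above; `candidate_peaks` and `consistent_progressions` are hypothetical names chosen for illustration. It tries every pairing of a byte 0 peak with a byte 1 peak and keeps the pairings whose stride also lands on a recorded peak for every other byte.

```python
# Candidate spike locations per byte, copied from the example above
candidate_peaks = {
    0: [37, 41],
    1: [77, 81],
    2: [105, 117, 121],
    3: [141, 157, 161],
    4: [197, 201],
}

def consistent_progressions(peaks):
    """Return every (start, stride) with start + stride*byte recorded for all bytes."""
    found = []
    for start in peaks[0]:
        for second in peaks[1]:
            stride = second - start
            if stride <= 0:
                continue
            if all((start + stride * byte) in peaks[byte] for byte in peaks):
                found.append((start, stride))
    return found

print(consistent_progressions(candidate_peaks))
```

With the example numbers, the series starting at 37 fits the stride of 40 as well as the one starting at 41, so the sketch reports both candidates and leaves the final choice to you.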

#### The Other 127

The best way to attack the IV would be to repeat the 1-bit conceptual attack for each of the bits. Try to do this yourself! (Really!) If you're stuck, here are a few hints to get you going:

One easy way of looping through the bits is by using two nested loops, like this:

```python
for byte in range(16):
    for bit in range(8):
        # Attack bit number (byte*8 + bit)
```

The sample that you'll want to look at will depend on which byte you're attacking. We had success when we used `location = 51 + byte*60`, but your mileage will vary.

The bitshift operator and the bitwise-AND operator are useful for getting at a single bit:

```python
# This will either result in a 0 or a 1
checkIfBitSet = (byteToCheck >> bit) & 0x01
```
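
As a quick check of that hint, the sketch below pulls a byte apart into bits and puts it back together again. `byte_to_bits` and `bits_to_byte` are hypothetical helper names, and the least-significant-bit-first order is an arbitrary choice; any order works as long as splitting and rebuilding agree.

```python
def byte_to_bits(value):
    """Split a byte into a list of 8 bits, least-significant bit first."""
    return [(value >> bit) & 0x01 for bit in range(8)]

def bits_to_byte(bits):
    """Rebuild a byte from 8 bits given least-significant bit first."""
    result = 0
    for bit, bit_value in enumerate(bits):
        result |= (bit_value & 0x01) << bit
    return result

bits = byte_to_bits(0xC1)
print(bits, "{:02X}".format(bits_to_byte(bits)))
```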

If you're really, really stuck, the end of this tutorial has a working script. After finding the IV, check `supersecret.h` and verify that your attack was successful.

## Attacking the Signature

The last thing we can do with this bootloader is attack the signature. Since the bootloader has different execution times based on the number of correct signature bytes, it's possible to perform a timing attack similar to the one covered in Tutorial B3. The complete attack won't be covered in this tutorial; if you're feeling very confident in your SPA skills, you may want to attempt it yourself.

This final section will show how one byte of the signature could be recovered. If you want more of this kind of analysis, a more complete timing attack is shown in Tutorial B3-1 Timing Analysis with Power for Password Bypass.

### Attack Theory

Recall from earlier that the signature check in C looks like:

```C
if ((tmp32[0] == SIGNATURE1) &&
    (tmp32[1] == SIGNATURE2) &&
    (tmp32[2] == SIGNATURE3) &&
    (tmp32[3] == SIGNATURE4)){
```

Open the listing (`.lss`) file for your binary and find the above check. You should find that the first unsuccessful check will cause it to branch. In C, boolean expressions support short-circuiting. When checking multiple conditions, the program will stop evaluating these booleans as soon as it can tell what the final value will be. In this case, unless all four of the equality checks are true, the result will be false. Thus, as soon as the program finds a single false condition, it's done.

The assembly code confirms this short-circuiting operation. Each of the four assembly blocks includes a comparison and a conditional branch. All four of the branches return the program to the same location (the start of the `while(1)` loop). If any of the comparisons is false, its branch returns the program to the start of the loop. None of the four branches may be taken if the program is to get into the body of the `if` block.

The short-circuiting conditions are perfect for us. We can use our power traces to watch how long it takes for the signature check to fail. If the check takes longer than usual, then we know that the first byte of our signature was right.
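
The effect is easy to reproduce in software. The sketch below is a simulation only: `SECRET_SIGNATURE` holds made-up values, and the number of comparisons stands in for the execution time that we would read off a power trace. It mimics the chained `&&` check and then tries all 256 values for the first byte.

```python
# Hypothetical 4-byte signature, used only for this simulation
SECRET_SIGNATURE = [0x12, 0x34, 0x56, 0x78]

def short_circuit_check(candidate, secret=SECRET_SIGNATURE):
    """Mimic the chained && comparison: stop at the first mismatch.

    Returns (accepted, comparisons_made).
    """
    comparisons = 0
    for got, expected in zip(candidate, secret):
        comparisons += 1
        if got != expected:
            return False, comparisons
    return True, comparisons

def recover_first_byte():
    """Try every first byte and return the one whose check ran longest."""
    timings = {}
    for guess in range(256):
        _, timings[guess] = short_circuit_check([guess, 0x00, 0x00, 0x00])
    return max(timings, key=timings.get)

print("first byte guess: 0x{:02X}".format(recover_first_byte()))
```

Only the correct first byte reaches the second comparison, so it is the only guess that takes longer than the rest.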

In [463]:
from tqdm import tqdm
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import numpy as np
import time
traces = []
keys = []
plaintexts = []

iv = [0xC1, 0x25, 0x68, 0xDF, 0xE7, 0xD3, 0x19, 0xDA, 0x10, 0xE2, 0x41, 0x71, 0x33, 0xB0, 0xEB, 0x3C]

knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
            0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]


knownkey = bytes(knownkey)
aes = AES.new(knownkey, AES.MODE_ECB)

key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
N = 100 # Number of traces
sig_start = 0x00
sig_start2 = 0xEB

reset_target(scope)
okay=0
scope.adc.basic_mode = "falling_edge"
while not okay:
    target.ser.write("\0xxxxxxxxxxxxxxxxxx")
    time.sleep(0.005)
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if response:
        if ord(response[0]) == 0xA1:
            okay = 1

attack_spot = 0

scope.adc.samples = 24000
scope.adc.offset = 0
target.init()
for i in tqdm(range(N), desc='Capturing traces'):
    #reset_target(scope)
    message = [0x00]
    

    num_char = target.ser.inWaiting()
    target.ser.read(num_char)
    
    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
    
    if not attack_spot:
        text = [0] * 16
        #text[1] = 0xEB
        #text[2] = 0x02
        #text[3] = 0x1D
        attack_spot += 1
        
    #text[0] = 0
    text2 = [0] * 16
    text2[:] = text[:]
    textcpy = [0] * 16
    textcpy[:] = text[:]
    
    #tmp = text[attack_spot]
    #text[attack_spot] = sig_start
    #sig_start = tmp
    
    #tmp = text[attack_spot+1]
    #text[attack_spot+1] = sig_start2
    #sig_start2 = tmp
    
    plaintexts.append(textcpy)
    
    for i in range(len(iv)):
        text[i] ^= iv[i]
        
    
    for i in range(16):
        print("{:02X}".format(text[i]), end="")
    
    print("")
    
    ct = aes.encrypt(bytes(text))
    pt = bytearray(aes.decrypt(ct))
    for i in range(16):
        pt[i] ^= iv[i]
        
    for i in range(16):
        print("{:02X}".format(pt[i]), end="")
    
    print("")
    message.extend(ct)
    iv[:] = ct[:]
    
    crc = bl_crc.bit_by_bit(ct)
    message.append(crc >> 8)
    message.append(crc & 0xFF)

    # run aux stuff that should run before the scope arms here
    
    #flush target's serial
    okay = 0

    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.ser.write(message)
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
            continue
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if ord(response[0]) != 0xA4:
        # Bad response, just skip
        print("Bad response: {:02X}".format(ord(response[0])))
        continue
    #if "x" in response:
        #print("Got past sig")
    
    #print(response)
    #print(scope.adc.trig_count)
    
    traces.append(scope.getLastTrace())



C12568DFE7D319DA10E2417133B0EB3C
00000000000000000000000000000000
FCBA91B126B01F4AE82EEB9787044525
2B591786C4329BC178FD9FDC5CF2EECB
...
Capturing traces: 100%|██████████| 100/100 [00:26<00:00,  4.00it/s]

### Finding a Single Byte

Okay, we know that our power trace will look a lot different for one of our choices of signatures. Let's figure out which one. We'll start by finding the average over all of our traces:

In [464]:
# Find the average over all of the traces
mean = np.average(traces, axis=0)
#p = figure()


#p.line(range(len(traces[0])), traces[0])
#p.line(range(len(traces[1])), traces[1], line_color="red")
#show(p)

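Then, we'll split our traces into 256 different groups (one for each possible value of the first plaintext byte). Since we know the IV, the capture loop was able to build ciphertexts that decrypt to plaintexts of our choosing, so `plaintexts[i]` holds the actual plaintext that the bootloader checks: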

In [465]:
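# Split the traces into groups by the first plaintext byte
# Note: this assumes no captures were skipped, so traces[i] matches plaintexts[i]
groupedTraces = [[] for _ in range(256)]
for i in range(len(traces)):
    group = plaintexts[i][0]
    # A trace made up entirely of zeros means the capture returned nothing useful
    if np.any(traces[i]):
        groupedTraces[group].append(traces[i])
    else:
        print("Got blank trace {}".format(i))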
    

Next, we can find the mean for each group and see how much they differ from the overall mean:

In [466]:
# Find the mean for each group
means = np.zeros([256, 24000])
for i in range(256):
    #print(groupedTraces[i], end="\n\n")
    if len(groupedTraces[i]) > 0:
        means[i] = np.average(groupedTraces[i], axis=0)
        
p = figure()
colors = ["red", "blue", "green", "yellow"]
xrange = range(len(means[0]))
#print(groupedTraces[0])
for i in range(0,10):
    if len(groupedTraces[i]) > 0:
        p.line(xrange, means[i]-mean, line_color=colors[i%4])
    
show(p)

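In [410]:
# Correlate each group's mean with the overall mean over samples 18000-20000.
# Groups that received no traces still hold all zeros, so their correlation is nan.
corr = []
for i in range(256):
    corr.append(np.corrcoef(mean[18000:20000], means[i][18000:20000])[0, 1])
print(np.sort(corr))     # correlations, lowest first (nan sorts last)
print(np.argsort(corr))  # group (first-byte value) for each sorted correlation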

[0.99910735 0.99916313 0.99926892 0.9992867  0.99928679 0.99929441
 0.99931374 0.99932227 0.99932935 0.99933034 0.99935479 0.99936541
 0.99936583 0.99936674 0.99936791 0.99936929 0.99937529 0.99937624
 0.99938425 0.99938568 0.99939836 0.99940096 0.99940416 0.99940434
 0.99941015 0.99941797 0.99942118 0.99942391 0.99942515 0.99942711
 0.99942725 0.99942965 0.99943058 0.99944477 0.9994455  0.99944652
 0.99944685 0.99945296 0.99945345 0.99945472 0.99946473 0.99946854
 0.99946971 0.9994704  0.99947268 0.99947716 0.99948722 0.99949253
 0.99949565 0.99949709 0.99949817 0.9994987  0.9995093  0.99950964
 0.99951831 0.99951985 0.99952205 0.99952219 0.99952297 0.9995269
 0.99952724 0.99953169 0.99953366 0.99954474 0.9995462  0.99955119
 0.9995624  0.99956565 0.99958732 0.9995994  0.99960577 0.99960835
 0.99970463 0.99972234 0.99972506 0.99974211 0.99974402 0.99975457
 0.99976496 0.99977074 0.99977344 0.99977699 0.99977984 0.99980707
 0.99984072        nan        nan        nan        nan        
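
One way to read numbers like these, assuming the byte that passes the first comparison gives the mean trace that lines up worst with the overall mean, is to pick the group with the lowest correlation while skipping the `nan` entries from empty groups. The sketch below does this on small synthetic data in plain Python; `pearson` and `least_similar_group` are hypothetical helpers, not NumPy functions.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length sequences; nan if either is constant."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)

def least_similar_group(overall_mean, group_means):
    """Index of the group mean with the lowest non-nan correlation to the overall mean."""
    scored = []
    for index, group_mean in enumerate(group_means):
        r = pearson(overall_mean, group_mean)
        if not math.isnan(r):          # skip groups that received no traces
            scored.append((r, index))
    return min(scored)[1]

# Synthetic demo: group 1 is empty (all zeros), group 2 is shifted in time
overall = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
groups = [list(overall) for _ in range(4)]
groups[1] = [0.0] * 8
groups[2] = [0.0, 1.0, 0.0, -1.0, 1.0, 0.0, -1.0, 0.0]
print(least_similar_group(overall, groups))
```

With only 100 traces spread over 256 groups, many groups stay empty, so a real attack would need more traces (or a chosen set of first bytes) before every candidate can be ranked.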



## Conclusion

We've now successfully recovered the encryption key and the IV for the bootloader!

## Appendix A: IV Attack Script

Make sure you've run the IV analysis blocks before running this block.

In [52]:
for byte in range(16):
    location = 41 + byte * 40
    iv = 0
    for bit in range(8):
        pt_bits = [((dr[i][byte] >> (7-bit)) & 0x01) for i in range(numTraces)]

        # Split traces into 2 groups
        groupedPoints = [[] for _ in range(2)]
        for i in range(numTraces):
            groupedPoints[pt_bits[i]].append(trace_array[i][location])
            
        means = []
        for i in range(2):
            means.append(np.average(groupedPoints[i]))
        diff = means[1] - means[0]
        
        iv_bit = 1 if diff > 0 else 0
        iv = (iv << 1) | iv_bit
        
        print(iv_bit, end = " ")
        
    print("{:02X}".format(iv))

1 1 0 0 0 0 0 1 C1
0 0 1 0 0 1 0 1 25
0 1 1 0 1 0 0 0 68
1 1 0 1 1 1 1 1 DF
1 1 1 0 0 1 1 1 E7
1 1 0 1 0 0 1 1 D3
0 0 0 1 1 0 0 1 19
1 1 0 1 1 0 1 0 DA
0 0 0 1 0 0 0 0 10
1 1 1 0 0 0 1 0 E2
0 1 0 0 0 0 0 1 41
0 1 1 1 0 0 0 1 71
0 0 1 1 0 0 1 1 33
1 0 1 1 0 0 0 0 B0
1 1 1 0 1 0 1 1 EB
0 0 1 1 1 1 0 0 3C
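
To reuse the result, the printed lines can be turned back into a list of bytes. The sketch below is an optional convenience and not part of the original script: `parse_iv` is a hypothetical helper that rebuilds each byte from its eight printed bits (most-significant bit first, matching the script) and checks it against the hex value at the end of the line.

```python
script_output = """\
1 1 0 0 0 0 0 1 C1
0 0 1 0 0 1 0 1 25
0 1 1 0 1 0 0 0 68
1 1 0 1 1 1 1 1 DF
1 1 1 0 0 1 1 1 E7
1 1 0 1 0 0 1 1 D3
0 0 0 1 1 0 0 1 19
1 1 0 1 1 0 1 0 DA
0 0 0 1 0 0 0 0 10
1 1 1 0 0 0 1 0 E2
0 1 0 0 0 0 0 1 41
0 1 1 1 0 0 0 1 71
0 0 1 1 0 0 1 1 33
1 0 1 1 0 0 0 0 B0
1 1 1 0 1 0 1 1 EB
0 0 1 1 1 1 0 0 3C
"""

def parse_iv(output):
    """Rebuild the IV bytes from lines of 8 bits followed by a hex byte."""
    iv_bytes = []
    for line in output.strip().splitlines():
        fields = line.split()
        bits, printed_hex = fields[:8], fields[8]
        value = 0
        for bit in bits:                      # most-significant bit first
            value = (value << 1) | int(bit)
        if value != int(printed_hex, 16):
            raise ValueError("bits and hex disagree on line: " + line)
        iv_bytes.append(value)
    return iv_bytes

recovered_iv = parse_iv(script_output)
print(", ".join("0x{:02X}".format(b) for b in recovered_iv))
```

The resulting list can then be compared against the IV in `supersecret.h`.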
