# Tutorial A5 (Breaking AES-256 Bootloader)

This tutorial walks through a complete attack on an encrypted bootloader that uses AES-256. It demonstrates how to apply side-channel power analysis to a practical system, and discusses how to perform the analysis with different Analyzer models.

## Background

In the world of microcontrollers, a bootloader is a special piece of firmware that is made to let the user upload new programs into memory. This is especially useful for devices with complex code that may need to be patched or otherwise updated in the future - a bootloader makes it possible for the user to upload a patched version of the firmware onto the micro. The bootloader receives information from a communication line (a USB port, serial port, ethernet port, WiFi connection, etc...) and stores this data into program memory. Once the full firmware has been received, the micro can happily run its updated code.

There is one big security issue to worry about with bootloaders. A company may want to stop their customers from writing their own firmware and uploading it onto the micro. For example, this might be for protection reasons - hackers might be able to access parts of the device that weren't meant to be accessed. One way of stopping this is to add encryption. The company can add their own secret signature to the firmware code and encrypt it with a secret key. Then, the bootloader can decrypt the incoming firmware and confirm that the incoming firmware is correctly signed. Users will not know the secret key or the signature tied to the firmware, so they won't be able to "fake" their own.

This tutorial will work with a simple AES-256 bootloader. The victim will receive data through a serial connection, decrypt the command, and confirm that the included signature is correct. Then, it will only save the code into memory if the signature check succeeded. To make this system more robust against attacks, the bootloader will use cipher-block chaining (CBC mode). Our goal is to find the secret key and the CBC initialization vector so that we could successfully fake our own firmware.

### Bootloader Communications Protocol

The bootloader's communications protocol operates over a serial port at 38400 baud. In this example the bootloader is always waiting for new data; in real life one would typically have to force the device into the bootloader with a command sequence.

Commands sent to the bootloader look as follows:

```
       |<-------- Encrypted block (16 bytes) ---------->|
       |                                                |
+------+------+------+------+------+------+ .... +------+------+------+
| 0x00 |    Signature (4 Bytes)    |  Data (12 Bytes)   |   CRC-16    |
+------+------+------+------+------+------+ .... +------+------+------+
```

This frame has four parts:

* `0x00`: 1 byte of fixed header
* Signature: A secret 4 byte constant. The bootloader will confirm that this signature is correct after decrypting the frame.
* Data: 12 bytes of the incoming firmware. This system forces us to send the code 12 bytes at a time; more complete bootloaders may allow longer variable-length frames.
* CRC-16: A 16-bit checksum using the CRC-CCITT polynomial (0x1021). The LSB of the CRC is sent first, followed by the MSB. The bootloader will reply over the serial port, describing whether or not this CRC check was valid.

As described in the diagram, the 16 byte block is not sent as plaintext. Instead, it is encrypted using AES-256 in CBC mode. This encryption method will be described in the next section.
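
The frame layout above can be sketched in a few lines of Python. This is a minimal illustration, not the tutorial's own code: `build_frame` and its `crc_lsb_first` parameter are hypothetical names, and the CRC start value of 0 is an assumption (it matches the `Crc` class used later in this tutorial). The byte order of the CRC is exposed as a parameter because it is easy to get wrong; the default follows the frame description above, so check it against your firmware.

```python
import binascii

def build_frame(block, crc_lsb_first=True):
    """Assemble one 19-byte bootloader frame around a 16-byte encrypted block.

    Hypothetical helper: the name and the crc_lsb_first switch are not part
    of the tutorial, they only illustrate the frame layout.
    """
    block = bytes(block)
    if len(block) != 16:
        raise ValueError("the encrypted block must be exactly 16 bytes")
    # binascii.crc_hqx uses the CRC-CCITT polynomial 0x1021.
    # Assumption: the CRC register starts at 0.
    crc = binascii.crc_hqx(block, 0)
    crc_bytes = crc.to_bytes(2, "little" if crc_lsb_first else "big")
    # Fixed 0x00 header, then the encrypted block, then the checksum
    return bytes([0x00]) + block + crc_bytes

frame = build_frame(bytes(range(16)))
print(len(frame), frame.hex())
```

The frame is always 19 bytes long: 1 header byte, 16 encrypted bytes, and 2 CRC bytes.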

The bootloader responds to each command with a single byte indicating if the CRC-16 was OK or not:

```
            +------+
CRC-OK:     | 0xA1 |
            +------+

            +------+
CRC Failed: | 0xA4 |
            +------+
```
Then, after replying to the command, the bootloader verifies that the signature is correct. If it matches the expected manufacturer's signature, the 12 bytes of data will be written to flash memory. Otherwise, the data is discarded.

### Details of AES-256 CBC

The system uses the AES algorithm in Cipher Block Chaining (CBC) mode. In general one avoids using encryption 'as-is' (i.e. Electronic Code Book mode), since it means any piece of plaintext always maps to the same piece of ciphertext. Cipher Block Chaining mixes each block with the previous ciphertext block, so repeated plaintext blocks no longer encrypt to the same ciphertext.

The design of the encryption side is covered in other references; we'll only be talking about the decryption side here. In this case AES-256 CBC mode is used as follows (the AES-256 Decryption block itself will be discussed in detail later):

![AES-256](https://wiki.newae.com/images/8/88/Aes256_cbc.png)

This diagram shows that the output of the decryption is no longer used directly as the plaintext. Instead, the output is XORed with a 16 byte mask, which is usually taken from the previous ciphertext. Also, the first decryption block has no previous ciphertext to use, so a secret initialization vector (IV) is used instead. If we are going to decrypt the entire ciphertext (including block 0) or correctly generate our own ciphertext, we'll need to find this IV along with the AES key.
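
The chaining described above can be sketched independently of AES. In the snippet below the block cipher is a deliberately trivial toy (`toy_encrypt` / `toy_decrypt`, hypothetical names, not secure and not AES) so that the example stays self-contained; only the XOR-with-previous-ciphertext structure is the point.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def cbc_encrypt(plain_blocks, iv, encrypt_block):
    """CBC encryption: XOR with the previous ciphertext (or IV), then encrypt."""
    prev, out = iv, []
    for pt in plain_blocks:
        ct = encrypt_block(xor_bytes(pt, prev))
        out.append(ct)
        prev = ct
    return out

def cbc_decrypt(cipher_blocks, iv, decrypt_block):
    """CBC decryption: decrypt, then XOR with the previous ciphertext (or IV)."""
    prev, out = iv, []
    for ct in cipher_blocks:
        out.append(xor_bytes(decrypt_block(ct), prev))
        prev = ct
    return out

# Toy stand-in for the AES-256 block operation (assumption, NOT secure)
def toy_encrypt(block):
    return bytes(((b ^ 0x5A) + 7) & 0xFF for b in block)

def toy_decrypt(block):
    return bytes(((b - 7) & 0xFF) ^ 0x5A for b in block)

iv = bytes(range(16))
plain = [b"A" * 16, b"A" * 16]          # two identical plaintext blocks
cipher = cbc_encrypt(plain, iv, toy_encrypt)
print(cipher[0] != cipher[1])            # chaining hides the repetition
print(cbc_decrypt(cipher, iv, toy_decrypt) == plain)
```

Note that decrypting block 0 correctly requires the IV, which is why the attack has to recover it in addition to the key.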

### Attacking AES-256

The system in this tutorial uses AES-256 encryption, which has a 256 bit (32 byte) key - twice as large as the 16 byte key we've attacked in previous tutorials. This means that our regular AES-128 CPA attacks won't quite work. However, extending these attacks to AES-256 is fairly straightforward: the theory is explained in detail in Extending AES-128 Attacks to AES-256.

As the theory page explains, our AES-256 attack will have 4 steps:

1. Perform a standard attack (as in AES-128 decryption) to determine the first 16 bytes of the key, corresponding to the 14th round encryption key.
1. Using the known 14th round key, calculate the hypothetical outputs of each S-Box from the 13th round using the ciphertext processed by the 14th round, and determine the 16 bytes of the 13th round key manipulated by inverse MixColumns.
1. Perform the MixColumns and ShiftRows operations on the hypothetical key determined above, recovering the 13th round key.
1. Using the AES-256 key schedule, reverse the 13th and 14th round keys to determine the original AES-256 encryption key.
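
Step 3 relies on the two standard AES transforms ShiftRows and MixColumns. The sketch below implements them in plain Python for a 16-byte state stored column by column. The function names are illustrative, and the order in which the two are applied in `recover_round_key` (ShiftRows first, then MixColumns) is an assumption that should be checked against the theory page.

```python
def xtime(a):
    """Multiply a byte by 2 in GF(2^8) using the AES polynomial 0x11B."""
    a <<= 1
    return (a ^ 0x11B) & 0xFF if a & 0x100 else a

def shiftrows(state):
    """AES ShiftRows: row r of the column-major state rotates left by r."""
    return [state[(i + 4 * (i % 4)) % 16] for i in range(16)]

def mixcolumns(state):
    """AES MixColumns applied to each 4-byte column of the state."""
    out = []
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        t = col[0] ^ col[1] ^ col[2] ^ col[3]
        # col[i] ^ t ^ 2*(col[i] ^ col[i+1]) equals 2*a_i + 3*a_(i+1) + a_(i+2) + a_(i+3)
        out += [col[i] ^ t ^ xtime(col[i] ^ col[(i + 1) % 4]) for i in range(4)]
    return out

def recover_round_key(hypothetical_key):
    """Assumed ordering: ShiftRows, then MixColumns (verify against the theory page)."""
    return mixcolumns(shiftrows(list(hypothetical_key)))

print(shiftrows(list(range(16))))
print([hex(b) for b in mixcolumns([0xDB, 0x13, 0x53, 0x45] * 4)[:4]])
```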

## Firmware

For this tutorial, we'll be using the `bootloader-aes256` project, which we'll build as usual:

In [1]:
PLATFORM = "CWLITEARM"
CRYPTO_TARGET="NONE"

In [2]:
%%bash -s "$PLATFORM" "$CRYPTO_TARGET"
cd ../../hardware/victims/firmware/bootloader-aes256
make PLATFORM=$1 CRYPTO_TARGET=$2

rm -f -- bootloader-aes256-CWLITEARM.hex
rm -f -- bootloader-aes256-CWLITEARM.eep
rm -f -- bootloader-aes256-CWLITEARM.cof
rm -f -- bootloader-aes256-CWLITEARM.elf
rm -f -- bootloader-aes256-CWLITEARM.map
rm -f -- bootloader-aes256-CWLITEARM.sym
rm -f -- bootloader-aes256-CWLITEARM.lss
rm -f -- objdir/*.o
rm -f -- objdir/*.lst
rm -f -- bootloader.s aes256.s crcccitt.s simpleserial.s stm32f3_hal.s stm32f3_hal_lowlevel.s stm32f3_sysmem.s
rm -f -- bootloader.d aes256.d crcccitt.d simpleserial.d stm32f3_hal.d stm32f3_hal_lowlevel.d stm32f3_sysmem.d
rm -f -- bootloader.i aes256.i crcccitt.i simpleserial.i stm32f3_hal.i stm32f3_hal_lowlevel.i stm32f3_sysmem.i
.
-------- begin --------
arm-none-eabi-gcc (15:6.3.1+svn253039-1build1) 6.3.1 20170620
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

.
Compiling C: bootloader.c
arm-none-eabi-gcc -c

## Capturing Traces

### Setup

To start, we'll proceed with setup as usual:

In [3]:
%run "Helper Scripts/CWLite_Connect.ipynb"



In [4]:
%run "Helper Scripts/Setup_Target_Generic.ipynb"

In [5]:
# uncomment based on your target
fw_path = "../../hardware/victims/firmware/bootloader-aes256/bootloader-aes256-CWLITEARM.hex"
#%run "Helper Scripts/Program_XMEGA.ipynb"
%run "Helper Scripts/Program_STM.ipynb"
#%run "Helper Scripts/No_Programmer.ipynb"

In [96]:
program_target(scope, fw_path)

Detected known STMF32: STM32F302xB(C)/303xB(C)
Extended erase (0x44), this can take ten seconds or more
Attempting to programming 5875 bytes at 0x8000000
STM32F Programming flash...
STM32F Reading flash...
Verified flash OK, 5875 bytes


### Calculating the CRC

The next step in attacking this target is to communicate with it. Most of the transmission is fairly straightforward, but the CRC is a little tricky. Luckily, there's a lot of open source code out there for calculating CRCs. In this case, we'll pull some code from pycrc:

In [6]:
# Class Crc
#############################################################
# These CRC routines are copy-pasted from pycrc, which are:
# Copyright (c) 2006-2013 Thomas Pircher <tehpeh@gmx.net>
#
class Crc(object):
    """
    A base class for CRC routines.
    """

    def __init__(self, width, poly):
        """The Crc constructor.

        The parameters are as follows:
            width
            poly
            reflect_in
            xor_in
            reflect_out
            xor_out
        """
        self.Width = width
        self.Poly = poly


        self.MSB_Mask = 0x1 << (self.Width - 1)
        self.Mask = ((self.MSB_Mask - 1) << 1) | 1

        self.XorIn = 0x0000
        self.XorOut = 0x0000

        self.DirectInit = self.XorIn
        self.NonDirectInit = self.__get_nondirect_init(self.XorIn)
        if self.Width < 8:
            self.CrcShift = 8 - self.Width
        else:
            self.CrcShift = 0

    def __get_nondirect_init(self, init):
        """
        return the non-direct init if the direct algorithm has been selected.
        """
        crc = init
        for i in range(self.Width):
            bit = crc & 0x01
            if bit:
                crc ^= self.Poly
            crc >>= 1
            if bit:
                crc |= self.MSB_Mask
        return crc & self.Mask


    def bit_by_bit(self, in_data):
        """
        Classic simple and slow CRC implementation.  This function iterates bit
        by bit over the augmented input message and returns the calculated CRC
        value at the end.
        """
        # If the input data is a string, convert to bytes.
        if isinstance(in_data, str):
            in_data = [ord(c) for c in in_data]

        register = self.NonDirectInit
        for octet in in_data:
            for i in range(8):
                topbit = register & self.MSB_Mask
                register = ((register << 1) & self.Mask) | ((octet >> (7 - i)) & 0x01)
                if topbit:
                    register ^= self.Poly

        for i in range(self.Width):
            topbit = register & self.MSB_Mask
            register = ((register << 1) & self.Mask)
            if topbit:
                register ^= self.Poly

        return register ^ self.XorOut
    
bl_crc = Crc(width = 16, poly=0x1021)

Now we can easily get the CRC for our message by calling `bl_crc.bit_by_bit(message)`. 
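
As a sanity check, the same checksum can be written as a compact bitwise loop and compared against Python's standard library. This is a sketch under the assumption that the parameters are those set in the class above (width 16, polynomial 0x1021, initial value 0, no reflection, no final XOR); `crc16_ccitt` is a hypothetical name.

```python
import binascii

def crc16_ccitt(data, crc=0x0000):
    """Bitwise CRC-16 with polynomial 0x1021, MSB first, no final XOR."""
    for byte in bytes(data):
        crc ^= byte << 8                      # bring the next byte into the top of the register
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc

check = b"123456789"                          # conventional CRC check string
print(hex(crc16_ccitt(check)))
print(hex(binascii.crc_hqx(check, 0)))        # standard-library CRC-CCITT for comparison
```

If both values agree for your test messages, either routine can be used to compute the two CRC bytes appended to the frame.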

### Communicating with the Bootloader

With that done, we can start communicating with the bootloader. Recall that the bootloader expects:

* A `0x00` start byte
* A 16 byte encrypted message (4 bytes signature + 12 bytes data)
* A CRC-16

We don't really care what the 16 byte message is (just that each one is different, so that we get a variety of Hamming weights), so we'll use the same text/key module from earlier attacks.
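
To see why random messages are enough, the short sketch below draws a few random 16 byte blocks and prints the Hamming weight (number of set bits) of their first bytes. It is only an illustration with the standard `random` module, not the ChipWhisperer text/key module, and `hamming_weight` is a hypothetical helper.

```python
import random

def hamming_weight(value):
    """Number of bits set to 1 in an integer."""
    return bin(value).count("1")

rng = random.Random(0)  # fixed seed so the run is repeatable
blocks = [bytes(rng.randrange(256) for _ in range(16)) for _ in range(5)]

for block in blocks:
    # Random bytes spread over Hamming weights 0..8, which is what a
    # power-analysis attack needs in order to see data-dependent differences.
    print(block.hex(), [hamming_weight(b) for b in block[:4]])
```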

We can now run the following block, and we should get `0xA4` back. You may need to run this block a few times to get the right response back.

In [114]:
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import time
message = [0x00]
fake_message = [0x00, 0x00, 0x11, 0x00, 0x00,0x00, 0x00, 0x11, 0x00, 0x00,0x00, 0x00, 0x11, 0x00, 0x00,0x00, 0x00, 0x11, 0x00]

reset_target(scope)
target.ser.flush()
target.ser.write(fake_message)
num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
if response:
    print("Response: {:02X}".format(ord(response[0])))
    
target.ser.write(fake_message)
num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
if response:
    print("Response: {:02X}".format(ord(response[0])))
time.sleep(0.05)
target.ser.flush()
ktp = AcqKeyTextPattern_Basic(target=target)

# clear serial buffer
num_char = target.ser.inWaiting()
print(target.ser.read(num_char))

key, text = ktp.newPair() #don't care about key here
message.extend(text)

crc = bl_crc.bit_by_bit(text)

message.append(crc >> 8)
message.append(crc & 0xFF)

target.ser.write(message)
print(message)
time.sleep(0.1)

num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
# The bootloader may not have replied yet, so guard against an empty read
if response:
    print("Response: {:02X}".format(ord(response[0])))
else:
    print("No response received")


[0, 245, 154, 58, 67, 131, 81, 112, 172, 224, 200, 33, 32, 205, 95, 144, 135, 146, 213]

In [126]:
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import time
message = [0x00]

reset_target(scope)
time.sleep(0.1)
ktp = AcqKeyTextPattern_Basic(target=target)

# clear serial buffer
num_char = target.ser.inWaiting()
print(target.ser.read(num_char))

key, text = ktp.newPair() #don't care about key here
message.extend(text)

crc = bl_crc.bit_by_bit(text)

message.append(crc >> 8)
message.append(crc & 0xFF)

for i in range(3):
    target.ser.write(message)
    time.sleep(0.1)
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    #print("Response: {:02X}".format(ord(response[0])))
    print("Response: {}".format(response))

target.ser.write(message)
time.sleep(0.1)
num_char = target.ser.inWaiting()
response = target.ser.read(num_char)
#print("Response: {:02X}".format(ord(response[0])))
print("Response: {}".format(response))

 
Response: 
Response: 
Response: ¡¡
Response: ¤¤


### Capturing Traces

With that out of the way, we can proceed to capturing our traces. The 5000 samples per trace we normally capture aren't enough to reach the rounds we care about, so we'll need to increase the number of samples (11000 should be fine):

In [25]:
scope.adc.samples = 2000

We'll be working with Analyzer, so we'll need to use a ChipWhisperer project to store our traces and text:

In [14]:
from chipwhisperer.common.api.ProjectFormat import ProjectFormat
project = ProjectFormat()
project.setFilename("jupyter_test")
tc = project.getTraceFormat()
ktp = AcqKeyTextPattern_Basic(target=target)

In [17]:
help(scope.adc)
scope.adc.basic_mode = "falling_edge"

In [130]:
#Capture Traces
from tqdm import tqdm
from chipwhisperer.capture.acq_patterns.basic import AcqKeyTextPattern_Basic
import numpy as np
import time
keys = []
plaintexts = []

def reset_target(scope):
    scope.io.nrst = 'low'
    #scope.io.pdic = 'low'
    time.sleep(0.05)
    scope.io.nrst = 'high'
    #scope.io.pdic = 'high'
    
traces = []
N = 1000  # Number of traces
target.init()
for i in tqdm(range(N), desc='Capturing traces'):
    reset_target(scope)
    time.sleep(0.1)
    message = [0x00]
    

    num_char = target.ser.inWaiting()
    target.ser.read(num_char)
    
    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here
    keys.append(key)
    plaintexts.append(text)
    
    message.extend(text)
    
    crc = bl_crc.bit_by_bit(text)
    message.append(crc >> 8)
    message.append(crc & 0xFF)

    # run aux stuff that should run before the scope arms here
    
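    # The bootloader doesn't respond reliably right after reset, so keep sending
    # a dummy frame until it replies with 0xA1 before arming the scope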
    okay = 0
    while not okay:
        target.ser.write("\0xxxxxxxxxxxxxxxxxx")
        time.sleep(0.005)
        num_char = target.ser.inWaiting()
        response = target.ser.read(num_char)
        if response:
            if ord(response[0]) == 0xA1:
                okay = 1
    

    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.ser.write(message)
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
            # drop this pair so keys/plaintexts stay aligned with traces
            keys.pop()
            plaintexts.pop()
            continue
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    num_char = target.ser.inWaiting()
    response = target.ser.read(num_char)
    if not response or ord(response[0]) != 0xA4:
        # Bad or missing response: skip the trace and drop its key/text pair
        print("Bad response: {!r}".format(response))
        keys.pop()
        plaintexts.pop()
        continue
    
    traces.append(scope.getLastTrace())




In [131]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()
p = figure()

xrange = range(len(traces[0]))
p.line(xrange, traces[0], line_color="red")
show(p)

The capture loop above is pretty similar to the one in Tutorial B5, with our communication code added. It also checks the bootloader's response and skips the data if the response isn't the expected one.

In [132]:
from Crypto.Cipher import AES
import numpy as np

trace_array = np.asarray(traces)  # if you prefer to work with numpy array for number crunching
textin_array = np.asarray(plaintexts)
known_keys = np.asarray(keys)  # for fixed key, these keys are all the same

numTraces = len(trace_array)
traceLen = len(trace_array[0])

In [133]:
knownkey = [0x94, 0x28, 0x5D, 0x4D, 0x6D, 0xCF, 0xEC, 0x08, 0xD8, 0xAC, 0xDD, 0xF6, 0xBE, 0x25, 0xA4, 0x99,
            0xC4, 0xD9, 0xD0, 0x1E, 0xC3, 0x40, 0x7E, 0xD7, 0xD5, 0x28, 0xD4, 0x09, 0xE9, 0xF0, 0x88, 0xA1]

print(len(knownkey))
#knownkey = bytearray(knownkey).decode("latin-1")
#knownkey = knownkey.encode("latin-1")
knownkey = bytes(knownkey)
print(knownkey)
print(len(knownkey))
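# Decrypt every captured ciphertext with the known key in ECB mode (i.e.
# without the CBC XOR) to get the raw output of the AES decryption block
dr = []
aes = AES.new(knownkey, AES.MODE_ECB)
for i in range(numTraces):
    ct = bytes(textin_array[i])
    pt = aes.decrypt(ct)
    dr.append(list(pt))  # store the 16 output bytes as integers
print(dr)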

32
b'\x94(]Mm\xcf\xec\x08\xd8\xac\xdd\xf6\xbe%\xa4\x99\xc4\xd9\xd0\x1e\xc3@~\xd7\xd5(\xd4\t\xe9\xf0\x88\xa1'
32
[[138, 3, 82, 116, 46, 157, 65, 119, 249, 191, 184, 44, 60, 244, 209, 82], [141, 225, 118, 2, 34, 222, 71, 161, 61, 176, 32, 104, 222, 32, 68, 135], [92, 155, 126, 58, 221, 68, 30, 102, 52, 31, 168, 158, 253, 122, 2, 248], [227, 12, 78, 21, 183, 215, 223, 38, 86, 166, 132, 145, 135, 137, 134, 27], [117, 223, 27, 174, 87, 174, 30, 210, 148, 147, 189, 38, 94, 30, 129, 146], [178, 197, 183, 193, 215, 243, 178, 125, 21, 94, 26, 187, 26, 104, 94, 10], [60, 26, 17, 95, 75, 96, 92, 178, 215, 190, 17, 164, 213, 174, 251, 132], [60, 172, 69, 88, 116, 232, 154, 99, 151, 201, 121, 82, 95, 86, 245, 211], [99, 233, 168, 171, 17, 75, 77, 105, 248, 67, 254, 128, 122, 161, 224, 123], [155, 127, 34, 246, 106, 138, 90, 131, 37, 44, 242, 167, 120, 11, 176, 128], [87, 112, 68, 149, 237, 22, 43, 0, 154, 65, 110, 61, 119, 36, 115, 69], [41, 138, 74, 52, 10, 85, 211, 244, 180, 159, 118, 21, 229, 14

In [147]:
# Split traces into 2 groups
groupedTraces = [[] for _ in range(2)]
for i in range(numTraces):
    bit0 = dr[i][4] & 0x01
    groupedTraces[bit0].append(traces[i])
print(len(groupedTraces[0]))

# Find averages and differences
means = []
for i in range(2):
    means.append(np.average(groupedTraces[i], axis=0))
diff = means[1] - means[0]

p = figure()

xrange = range(len(diff))
xrange2 = range(len(traces[0]))
p.line(xrange, diff, line_color="red")
#p.line(xrange2, traces[0], line_color='blue')
show(p)

495


In [148]:
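# Recover the IV bit by bit: at the sample where each decrypted byte is XORed
# with the IV, compare the mean power of traces whose decrypted bit is 1
# against those whose bit is 0; the sign of the difference gives the IV bit
for byte in range(16):
    location = 41 + byte * 40  # sample index used for this byte
    iv = 0  # IV byte, assembled MSB first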
    for bit in range(8):
        pt_bits = [((dr[i][byte] >> (7-bit)) & 0x01) for i in range(numTraces)]

        # Split traces into 2 groups
        groupedPoints = [[] for _ in range(2)]
        for i in range(numTraces):
            groupedPoints[pt_bits[i]].append(traces[i][location])
            
        means = []
        for i in range(2):
            means.append(np.average(groupedPoints[i]))
        diff = means[1] - means[0]
        
        iv_bit = 1 if diff > 0 else 0
        iv = (iv << 1) | iv_bit
        
        print(iv_bit, end = " ")
        
    print("{:02X}".format(iv))

1 1 0 0 0 0 0 1 C1
0 0 1 0 0 1 0 1 25
0 1 1 0 1 0 0 0 68
1 1 0 1 1 1 1 1 DF
1 1 1 0 0 1 1 1 E7
1 1 0 1 0 0 1 1 D3
0 0 0 1 1 0 0 1 19
1 1 0 1 1 0 1 0 DA
0 0 0 1 0 0 0 0 10
1 1 1 0 0 0 1 0 E2
0 1 0 0 0 0 0 1 41
0 1 1 1 0 0 0 1 71
0 0 1 1 0 0 1 1 33
1 0 1 1 0 0 0 0 B0
1 1 1 0 1 0 1 1 EB
0 0 1 1 1 1 0 0 3C


With that, we're done with capturing traces! We can now disconnect from the hardware:

In [None]:
scope.dis()
target.dis()

## Analysis

Now that we have our traces, we can go ahead and perform the attack. As described in the background theory, we'll have to do two attacks - one to get the 14th round key, and another (using the first result) to get the 13th round key. Then, we'll do some post-processing to finally get the 256-bit encryption key.

### 14th Round Key

We can attack the 14th round key with a standard, no-frills CPA attack (using the inverse sbox, since it's a decryption that we're breaking):

In [None]:
import chipwhisperer as cw
from chipwhisperer.analyzer.attacks.cpa import CPA
from chipwhisperer.analyzer.attacks.cpa_algorithms.progressive import CPAProgressive
from chipwhisperer.analyzer.attacks.models.AES128_8bit import AES128_8bit, InvSBox_output

tm = project.traceManager()

attack = CPA()
leak_model = AES128_8bit(InvSBox_output)
attack.setAnalysisAlgorithm(CPAProgressive, leak_model)
attack.setTraceSource(tm)
attack.setTraceStart(0)
attack.setTracesPerAttack(tm.numTraces())
attack.setIterations(1)
attack.setReportingInterval(10)
attack.setTargetSubkeys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
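
For intuition about what a CPA attack with an inverse S-box model computes, here is a minimal, self-contained sketch on simulated data. It is not ChipWhisperer's implementation: the secret byte, the trace count, the noise level, and the helper names (`gmul`, `build_inv_sbox`, `cpa_byte`) are all assumptions made for illustration. Each "power sample" is simulated as the Hamming weight of the inverse S-box output plus Gaussian noise, and the key guess with the highest absolute correlation wins.

```python
import math
import random

def gmul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result

def build_inv_sbox():
    """Build the AES inverse S-box from the S-box definition (field inverse + affine map)."""
    inv_sbox = [0] * 256
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        inv_sbox[s ^ 0x63] = x  # s ^ 0x63 is sbox[x]
    return inv_sbox

INV_SBOX = build_inv_sbox()

def hamming_weight(x):
    return bin(x).count("1")

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0

def cpa_byte(inputs, samples):
    """Return (best_guess, correlation) for one key byte at one sample point."""
    scores = []
    for guess in range(256):
        hypothesis = [hamming_weight(INV_SBOX[c ^ guess]) for c in inputs]
        scores.append(abs(pearson(hypothesis, samples)))
    best = max(range(256), key=scores.__getitem__)
    return best, scores[best]

# Simulated measurements (all values here are hypothetical).
random.seed(1)
SECRET = 0xEA
inputs = [random.randrange(256) for _ in range(300)]
samples = [hamming_weight(INV_SBOX[c ^ SECRET]) + random.gauss(0, 1.0) for c in inputs]

best_guess, best_corr = cpa_byte(inputs, samples)
print("Best Guess = 0x{:02X} Corr = {:.3f}".format(best_guess, best_corr))
```

The real attack repeats this for all 16 bytes and for every sample point in the selected range, which is why narrowing the point range speeds it up.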

With the setup done, we can actually perform the attack. 11000 samples per trace is a lot to chew through, so if you want a faster attack you can pass a smaller range to `attack.setPointRange()`. `(2900, 4200)` will work for XMEGA, while `(1400, 2600)` will work for the STM32F3 (CWLite ARM).

In [None]:
attack.setPointRange((0, -1))
stats = attack.processTracesNoGUI()

Below you'll find the key that we should recover from this attack. You may want to check what we actually get against this key to make sure the attack is working.

In [None]:
key = [0xea, 0x79, 0x79, 0x20, 0xc8, 0x71, 0x44, 0x7d, 0x46, 0x62, 0x5f, 0x51, 0x85, 0xc1, 0x3b, 0xcb]
#key = keys[0]

In [None]:
rec_key = []
for bnum in stats.findMaximums():
    print("Best Guess = 0x{:02X} Corr = {}".format(bnum[0][0], bnum[0][2]))
    rec_key.append(bnum[0][0])
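
To check the recovered bytes against the expected key, a small comparison helper is enough. The sketch below is standalone: `compare_keys` is a hypothetical helper, and the "recovered" list is a made-up example with one wrong byte so that the output shows a mismatch. In the notebook you would pass `rec_key` and `key` instead.

```python
def compare_keys(recovered, expected):
    """Return the byte positions where the recovered key differs from the expected key."""
    return [i for i, (r, e) in enumerate(zip(recovered, expected)) if r != e]

# Expected 14th round key from the cell above.
expected = [0xea, 0x79, 0x79, 0x20, 0xc8, 0x71, 0x44, 0x7d,
            0x46, 0x62, 0x5f, 0x51, 0x85, 0xc1, 0x3b, 0xcb]

# Hypothetical attack result with a single incorrect byte at position 5.
recovered = list(expected)
recovered[5] ^= 0x01

mismatches = compare_keys(recovered, expected)
if mismatches:
    print("Mismatched byte positions:", mismatches)
else:
    print("All 16 bytes match")
```

If a few bytes are wrong, capturing more traces or widening the point range is usually the first thing to try.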

### 13th Round Key

Analyzer doesn't have a leakage model for the 13th round key built in, so we'll need to create our own. An example class is shown below along with the beginning of the setup. **NOTE: You'll need to update `calc_round_key` with the key you found in the last step.**

In [None]:
import chipwhisperer as cw
from chipwhisperer.analyzer.attacks.cpa import CPA
from chipwhisperer.analyzer.attacks.cpa_algorithms.progressive import CPAProgressive
from chipwhisperer.analyzer.attacks.models.AES128_8bit import AES128_8bit, AESLeakageHelper
from chipwhisperer.analyzer.preprocessing.resync_sad import ResyncSAD

class AES256_Round13_Model(AESLeakageHelper):
    def leakage(self, pt, ct, guess, bnum):
        #You must put YOUR recovered 14th round key here - this example may not be accurate!
        calc_round_key = [0xea, 0x79, 0x79, 0x20, 0xc8, 0x71, 0x44, 0x7d, 0x46, 0x62, 0x5f, 0x51, 0x85, 0xc1, 0x3b, 0xcb]
        xored = [calc_round_key[i] ^ pt[i] for i in range(0, 16)]
        block = xored
        block = self.inv_shiftrows(block)
        block = self.inv_subbytes(block)
        block = self.inv_mixcolumns(block)
        block = self.inv_shiftrows(block)
        result = block
        return self.inv_sbox((result[bnum] ^ guess[bnum]))
    
attack = CPA()
leak_model = AES128_8bit(AES256_Round13_Model)
attack.setAnalysisAlgorithm(CPAProgressive, leak_model)
attack.setTraceSource(tm)

#### Resyncing Traces (XMEGA Only)

The traces for the XMEGA version of the firmware become desynchronized around sample 7000. This is due to a non-constant-time AES implementation: the code does not always take the same amount of time to run for every input. (It's actually possible to do a timing attack on this AES implementation! We'll stick with our CPA attack for now.)

While this does open up a timing attack, it actually makes our AES attack a little harder, since we'll have to resync the traces. Luckily, this can be done pretty easily by using the ResyncSAD preprocessing module:

In [None]:
resync_traces = ResyncSAD(tm)
resync_traces.enabled = True
resync_traces.ref_trace = 0
resync_traces.target_window = (9100, 9300)
resync_traces.max_shift = 200
attack.setTraceSource(resync_traces)
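
To illustrate the idea behind sum-of-absolute-differences (SAD) resynchronization, here is a simplified sketch on synthetic data. It is not the `ResyncSAD` implementation: `sad_shift` and `realign` are hypothetical helpers, and the trace, window, and shift limit are made up. The window and maximum shift play the same roles as `target_window` and `max_shift` above.

```python
import random

def sad_shift(ref, trace, window, max_shift):
    """Find the shift of `trace` that best matches `ref` inside `window` (lowest SAD)."""
    start, end = window
    ref_segment = ref[start:end]
    best_shift, best_sad = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        lo, hi = start + shift, end + shift
        if lo < 0 or hi > len(trace):
            continue  # shifted window would fall outside the trace
        sad = sum(abs(a - b) for a, b in zip(ref_segment, trace[lo:hi]))
        if sad < best_sad:
            best_sad, best_shift = sad, shift
    return best_shift

def realign(trace, shift):
    """Move samples so that trace[i + shift] lands at index i, padding with zeros."""
    n = len(trace)
    return [trace[i + shift] if 0 <= i + shift < n else 0.0 for i in range(n)]

# Synthetic reference trace and a copy delayed by 5 samples.
random.seed(0)
ref = [random.random() for _ in range(200)]
delayed = [0.0] * 5 + ref[:-5]

shift = sad_shift(ref, delayed, window=(80, 120), max_shift=20)
aligned = realign(delayed, shift)
print("Detected shift:", shift)
```

The window should cover a distinctive feature that appears in every trace; a flat or repetitive region gives many shifts with similar SAD values and unreliable alignment.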

#### Running the Attack

Like in the 14th round attack, we can use a smaller range of points to make the attack faster. `(8000,10990)` works well for the XMEGA, while `(6500, 8500)` works well for the STM32F3.

In [None]:
attack.setTraceStart(0)
attack.setTracesPerAttack(tm.numTraces())
attack.setIterations(1)
attack.setReportingInterval(10)
attack.setTargetSubkeys([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
attack.setPointRange((0,-1))
stats = attack.processTracesNoGUI()

Run the block below to print the best guess for each byte:

In [None]:
rec_key2 = []
for bnum in stats.findMaximums():
    print("Best Guess = 0x{:02X}, Corr = {}".format(bnum[0][0], bnum[0][2]))
    rec_key2.append(bnum[0][0])

This, however, isn't actually the 13th round key. To get the real 13th round key, we'll need to run what we've recovered through a `shiftrows()` and `mixcolumns()` operation:

In [None]:
from chipwhisperer.analyzer.attacks.models.aes.funcs import shiftrows, mixcolumns
    
real_key2 = shiftrows(rec_key2)
real_key2 = mixcolumns(real_key2)

print("Recovered:", end="")
for subkey in real_key2:
    print(" {:02X}".format(subkey), end="")
print("")
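
The reason this extra step is needed is that MixColumns and ShiftRows are linear: applying their inverses to `x ^ k` gives the same result as applying them to `x` and to `k` separately and XORing the outputs. Our leakage model applies the inverse operations before the guess is XORed in, so the attack recovers a transformed version of the round key. The sketch below checks this on made-up data. It assumes the standard column-major AES state layout, and the function bodies are written here for illustration rather than taken from ChipWhisperer.

```python
import random

def gmul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result

MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
INV_MIX = [[14, 11, 13, 9], [9, 14, 11, 13], [13, 9, 14, 11], [11, 13, 9, 14]]

def _mix(state, matrix):
    """Multiply each 4-byte column of the state by a 4x4 matrix over GF(2^8)."""
    out = []
    for c in range(4):
        column = state[4 * c:4 * c + 4]
        for row in matrix:
            acc = 0
            for coeff, byte in zip(row, column):
                acc ^= gmul(coeff, byte)
            out.append(acc)
    return out

def mixcolumns(state):
    return _mix(state, MIX)

def inv_mixcolumns(state):
    return _mix(state, INV_MIX)

def shiftrows(state):
    # Row r is rotated left by r positions; the byte at (row r, column c) has index 4*c + r.
    return [state[4 * ((c + r) % 4) + r] for c in range(4) for r in range(4)]

def inv_shiftrows(state):
    return [state[4 * ((c - r) % 4) + r] for c in range(4) for r in range(4)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

# Hypothetical round key and intermediate state.
rng = random.Random(0)
k13 = [rng.randrange(256) for _ in range(16)]
x = [rng.randrange(256) for _ in range(16)]

# What the attack sees: the key after the inverse linear operations.
recovered = inv_shiftrows(inv_mixcolumns(k13))

lhs = inv_shiftrows(inv_mixcolumns(xor(x, k13)))
rhs = xor(inv_shiftrows(inv_mixcolumns(x)), recovered)
print("Linearity holds:", lhs == rhs)
print("Round key restored:", mixcolumns(shiftrows(recovered)) == k13)
```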

We now have everything we need to recover the full key! We'll start by combining the 13th and 14th round keys:

In [None]:
rec_key_comb = real_key2.copy()
rec_key_comb.extend(rec_key)

print("Key:", end="")
for subkey in rec_key_comb:
    print(" {:02X}".format(subkey), end="")
print("")

Then we can use the `keyScheduleRounds()` method of the `AES128_8bit` leakage model to roll the key schedule back to round keys 0 and 1, which together make up the 256-bit key:

In [None]:
result = leak_model.keyScheduleRounds(rec_key_comb, 13, 0)
result.extend(leak_model.keyScheduleRounds(rec_key_comb, 13, 1))
print("Key:", end="")
for subkey in result:
    print(" {:02X}".format(subkey), end="")
print("")
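
To show what rolling the key schedule back involves, here is a standalone sketch of the AES-256 key expansion and its reversal. It is not ChipWhisperer's `keyScheduleRounds` implementation: `expand_key_256` and `rollback_key_256` are hypothetical helpers written for illustration. The sketch uses the AES-256 example key from FIPS-197, expands it, and then recovers it again from the last two round keys alone.

```python
def gmul(a, b):
    """Multiply two bytes in GF(2^8) using the AES reduction polynomial."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return result

def build_sbox():
    """Build the AES S-box from its definition (field inverse + affine map)."""
    sbox = []
    for x in range(256):
        inv = next((y for y in range(1, 256) if gmul(x, y) == 1), 0)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox.append(s ^ 0x63)
    return sbox

SBOX = build_sbox()
RCON = [0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40]

def _g(prev, i):
    """Transformation applied to word w[i-1] when computing w[i] (AES-256, Nk = 8)."""
    if i % 8 == 0:
        rotated = prev[1:] + prev[:1]
        out = [SBOX[b] for b in rotated]
        out[0] ^= RCON[i // 8 - 1]
        return out
    if i % 8 == 4:
        return [SBOX[b] for b in prev]
    return list(prev)

def expand_key_256(key):
    """Expand a 32-byte key into 60 four-byte words (15 round keys)."""
    w = [list(key[4 * i:4 * i + 4]) for i in range(8)]
    for i in range(8, 60):
        t = _g(w[i - 1], i)
        w.append([a ^ b for a, b in zip(w[i - 8], t)])
    return w

def rollback_key_256(last_two_round_keys):
    """Recover the 32-byte key from round keys 13 and 14 (concatenated, 13 first)."""
    w = {52 + i: list(last_two_round_keys[4 * i:4 * i + 4]) for i in range(8)}
    for i in range(59, 7, -1):
        # w[i] = w[i-8] ^ g(w[i-1]), so w[i-8] = w[i] ^ g(w[i-1])
        t = _g(w[i - 1], i)
        w[i - 8] = [a ^ b for a, b in zip(w[i], t)]
    return [b for i in range(8) for b in w[i]]

FIPS_KEY = list(bytes.fromhex("603deb1015ca71be2b73aef0857d7781"
                              "1f352c073b6108d72d9810a30914dff4"))
words = expand_key_256(FIPS_KEY)
last_two = [b for word in words[52:60] for b in word]
recovered_key = rollback_key_256(last_two)
print("Key:", " ".join("{:02X}".format(b) for b in recovered_key))
```

Because every step of the expansion is an XOR with a value derived from a word we already know, any two consecutive round keys are enough to reconstruct the whole schedule, including the original key.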

You should see a 32-byte key printed out. Open `supersecret.h`, confirm that we have the right key, and celebrate!

## Conclusion

We've now successfully recovered the encryption key for the bootloader! You may recall that there are two other secret values we haven't yet recovered: the IV and the signature. In a future (currently unfinished) tutorial, we'll cover how to recover those values as well.