# Hardware Crypto Attack


So far we have mostly been talking about software crypto. But how can we extend these attacks to hardware crypto? Luckily, it takes very few changes, so you don't have much to do!

In this lab we'll look at what is required to attack a hardware crypto device, and which attacks work on these devices. The capture steps below use a CW308 STM32F4 target built with `CRYPTO_TARGET=HWAES`; if your target board has no hardware crypto, you can "cheat" and use an already recorded set of power traces instead.

## Capture

In [None]:
%%bash
cd ../../hardware/victims/firmware/
mkdir -p simpleserial-aes-lab1 && cp -r simpleserial-aes/* $_

In [None]:
%%bash
cd ../../hardware/victims/firmware/simpleserial-aes-lab1
make PLATFORM=CW308_STM32F4 CRYPTO_TARGET=HWAES

In [1]:
import chipwhisperer as cw
scope = cw.scope()
target = cw.target(scope)

In [2]:
%run "Helper_Scripts/Setup_Target_Generic.ipynb"


scope.adc.samples = 2000
scope.gain.mode = "low"
scope

cwlite Device
gain = 
    mode = low
    gain = 45
    db   = 22.50390625
adc = 
    state      = False
    basic_mode = rising_edge
    timeout    = 2
    offset     = 0
    presamples = 0
    samples    = 2000
    decimate   = 1
    trig_count = 820007544
clock = 
    adc_src       = clkgen_x4
    adc_phase     = 0
    adc_freq      = 8310447
    adc_rate      = 8310447.0
    adc_locked    = True
    freq_ctr      = 0
    freq_ctr_src  = extclk
    clkgen_src    = system
    extclk_freq   = 10000000
    clkgen_mul    = 2
    clkgen_div    = 26
    clkgen_freq   = 7384615.384615385
    clkgen_locked = True
trigger = 
    triggers = tio4
    module   = basic
io = 
    tio1       = serial_rx
    tio2       = serial_tx
    tio3       = high_z
    tio4       = high_z
    pdid       = high_z
    pdic       = high_z
    nrst       = high_z
    glitch_hp  = False
    glitch_lp  = False
    extclk_src = hs1
    hs2        = clkgen
    target_pwr = True
glitch = 
    clk_src     = target
    w

In [3]:
prog = cw.programmers.STM32FProgrammer
fw_path = "../../hardware/victims/firmware/simpleserial-aes-lab1/simpleserial-aes-CW308_STM32F4.hex"
cw.programTarget(scope, prog, fw_path)

Detected known STMF32: STM32F40xxx/41xxx
Extended erase (0x44), this can take ten seconds or more
Attempting to programming 4399 bytes at 0x8000000
STM32F Programming flash...
STM32F Reading flash...
Verified flash OK, 4399 bytes


In [4]:
project = cw.createProject("stm32f415_lab.cwp", overwrite=True)
tc = project.newSegment()

#Capture Traces
from tqdm import tqdm
import numpy as np
import time

ktp = cw.ktp.Basic(target=target)

N = 5000  # Number of traces
target.init()
for i in tqdm(range(N), desc='Capturing traces'):
    # run aux stuff that should come before trace here

    key, text = ktp.newPair()  # manual creation of a key, text pair can be substituted here

    target.loadEncryptionKey(key)
    target.loadInput(text)

    # run aux stuff that should run before the scope arms here

    scope.arm()

    # run aux stuff that should run after the scope arms here

    target.go()
    timeout = 50
    # wait for target to finish
    while target.isDone() is False and timeout:
        timeout -= 1
        time.sleep(0.01)

    try:
        ret = scope.capture()
        if ret:
            print('Timeout happened during acquisition')
    except IOError as e:
        print('IOError: %s' % str(e))

    # run aux stuff that should happen after trace here
    
    # note you may want:
    # num_char = target.ser.inWaiting()
    # response = target.ser.read(num_char)
    textout = target.readOutput()  # clears the response from the serial port
    #traces.append(scope.getLastTrace())
    tc.addTrace(scope.getLastTrace(), text, textout, key)
    
project.appendSegment(tc)
project.save()

# cleanup the connection to the target and scope
scope.dis()
target.dis()

Capturing traces: 100%|██████████| 5000/5000 [23:02<00:00,  5.71it/s]


## Analysis

Next, we'll load our traces for analysis. We can feed `project.traceManager()` right into `attack.setTraceSource()`, but we could also add preprocessing in between (more about this later). We'll also re-open the project, which is required here since the call to `closeAll()` would have flushed the buffers.

In [6]:
#Force reload of project data (if you comment out 'closeAll()' this isn't needed)

#We also rebuild the project object in case you only want to run this half
project = cw.openProject('./stm32f415_lab.cwp')

tm = project.traceManager()

First we'll get the traces and plot a few of them as-is. You can change which traces are plotted by adjusting `range(10)`; for example, `range(1)` plots only the first trace.

In [7]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.palettes import Dark2_5 as palette
import itertools  

output_notebook()
p = figure(sizing_mode='scale_width', plot_height=300)

# create a color iterator
colors = itertools.cycle(palette)  

x_range = range(0, tm.numPoints())
for i, color in zip(range(10), colors): #Adjust range(n) to plot certain traces
    p.line(x_range, tm.getTrace(i), color=color)
show(p)

If this all works, let's continue with the attack! We're going to use the same leakage model as previously (Hamming weight of the S-box output). We'll separate this out, since we will be changing that model shortly.

In [None]:
leak_model = cw.AES128(cw.AES128Leakage.SBox_output)
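To make the Hamming-weight model concrete, here is a minimal sketch of the correlation test that a CPA attack applies to each key guess. It uses synthetic data rather than captured traces: the helper names (`hamming_weight`, `pearson`), the noise level, and the random "intermediate" values are all assumptions for illustration, not ChipWhisperer internals. For the S-box output model, the intermediate would be the S-box output for a given plaintext byte and key guess.

```python
import math
import random


def hamming_weight(value):
    """Number of set bits in an integer."""
    return bin(value).count("1")


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


# Synthetic stand-in for one sample point across 2000 traces (assumption):
# the device leaks the Hamming weight of an intermediate byte plus noise.
rng = random.Random(0)
intermediates = [rng.randrange(256) for _ in range(2000)]
samples = [hamming_weight(v) + rng.gauss(0, 2.0) for v in intermediates]

# Hypothesis built from the true intermediates (a "correct key guess").
hyp_right = [hamming_weight(v) for v in intermediates]
# Hypothesis built from unrelated values (a "wrong key guess").
hyp_wrong = [hamming_weight(rng.randrange(256)) for _ in intermediates]

corr_right = pearson(hyp_right, samples)
corr_wrong = pearson(hyp_wrong, samples)
print("correct guess: %.3f, wrong guess: %.3f" % (corr_right, corr_wrong))
```

The hypothesis that matches the device's real behaviour correlates clearly with the samples, while the unrelated one stays near zero. This gap is what the attack uses to rank key guesses, and it only appears when the leakage model matches what the device actually does.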

In [13]:
attack = cw.CPA(project.traceManager(), leak_model)
attack.setReportingInterval(50)
attack.setPointRange((1312, 1317))

ERROR:root:Value -1 out of limits ((1, 5000)) in parameter "Traces per Attack"


And then actually run it:

In [14]:
cb = cw.getJupyterCallback(attack)
attack_results = attack.processTracesNoGUI(cb, show_progress_bar=True)

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
PGE=,0,10,0,0,21,0,0,0,1,0,0,0,0,5,0,0
0,D0 0.079,7D 0.055,F9 0.079,A8 0.059,22 0.057,EE 0.065,25 0.062,89 0.059,AD 0.054,3F 0.069,0C 0.063,C8 0.073,B6 0.083,A2 0.063,0C 0.050,A6 0.071
1,2C 0.054,58 0.048,F8 0.054,08 0.053,DF 0.056,F6 0.048,A1 0.045,99 0.048,E1 0.050,DA 0.052,CA 0.048,0E 0.047,C9 0.054,74 0.045,C3 0.048,58 0.052
2,88 0.048,E9 0.047,F0 0.043,FC 0.042,5B 0.055,92 0.047,4B 0.045,14 0.047,F9 0.050,9F 0.046,B0 0.045,57 0.044,52 0.053,D0 0.041,7B 0.046,E0 0.047
3,11 0.048,FD 0.045,21 0.043,E5 0.042,EA 0.050,1B 0.045,C3 0.044,92 0.045,2D 0.050,4F 0.042,B3 0.045,FF 0.043,54 0.049,D7 0.040,80 0.044,4F 0.046
4,95 0.046,AA 0.045,FE 0.040,DB 0.042,ED 0.047,1F 0.042,F9 0.043,7F 0.042,7D 0.048,11 0.041,78 0.040,E5 0.042,0C 0.048,44 0.039,F4 0.043,F1 0.044


Analysis in Progress: 100%|██████████| 409599/409599 [12:37<00:00, 541.02traces/s, Trace Interval: 4950-4999. Current Subkey: 15]


With the S-box output model, this attack will almost certainly fail. That leakage model is incorrect for this device, so we need to find the correct (new) leakage model. This turns out to be pretty easy, since most typical hardware implementations use only one of a few possible models. We'll try the "Last Round State Over-Written" model first. You can do this by updating the model above to the following:

In [10]:
leak_model = cw.AES128(cw.AES128Leakage.LastroundStateDiff)
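The idea behind this model can be sketched in a few lines. The model assumes the hardware keeps the AES state in a register that the final round overwrites with the ciphertext, so the leakage follows the Hamming distance between the round-9 state byte and the ciphertext byte stored in the same register position. The code below is an illustrative sketch under that assumption, not ChipWhisperer's actual implementation: the function names are hypothetical, the S-box is rebuilt from its mathematical definition to keep the block self-contained, and the guessed byte belongs to the last round key rather than the original key.

```python
def _gf_mul(a, b):
    """Multiply two bytes in GF(2^8) with the AES polynomial 0x11B."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def _build_sbox():
    """Build the AES S-box: multiplicative inverse, then affine transform."""
    sbox = [0] * 256
    for x in range(256):
        inv = 0
        if x:
            inv = 1
            for _ in range(254):  # x^254 is the inverse of x in GF(2^8)
                inv = _gf_mul(inv, x)
        s = inv
        for shift in range(1, 5):
            s ^= ((inv << shift) | (inv >> (8 - shift))) & 0xFF
        sbox[x] = s ^ 0x63
    return sbox


SBOX = _build_sbox()
INV_SBOX = [0] * 256
for _i, _s in enumerate(SBOX):
    INV_SBOX[_s] = _i

# Ciphertext byte i is derived from the round-9 state byte at STATE_INDEX[i]
# (ShiftRows on a column-major state: row r rotates left by r columns).
STATE_INDEX = [(i % 4) + 4 * (((i // 4) + (i % 4)) % 4) for i in range(16)]


def last_round_state_diff(ct, guess, bnum):
    """Hamming distance between a round-9 state byte and the ciphertext
    byte that overwrites it, for one guess of last-round key byte bnum."""
    st9 = INV_SBOX[ct[bnum] ^ guess]   # undo AddRoundKey and SubBytes
    st10 = ct[STATE_INDEX[bnum]]       # byte that lands in the same register
    return bin(st9 ^ st10).count("1")


ct = bytes(16)  # all-zero ciphertext, purely for demonstration
print(last_round_state_diff(ct, 0x63, 0))  # → 0
```

Because the hypothesis depends only on the ciphertext and one byte of the last round key, the attack can still be run one key byte at a time, just as with the S-box output model.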

Changing the leakage model might not be enough, though! You may also need to "window" around the area of interest. This is best done by plotting the results and picking a suitable area. For example, a window of `attack.setPointRange((1312, 1314))` seems to work well on some traces.

In [71]:
from bokeh.plotting import figure, show
from bokeh.io import output_notebook

output_notebook()
p = figure()

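bnum = 11  # index of the key byte (0-15) to inspect

key = attack.knownKey()
# data[v] is the correlation over time for key guess v of byte bnum
data = attack.getStatistics().diffs[bnum]
xr = range(0, len(data[0]))

# plot every key guess in green...
for v in range(0, 256):
    p.line(xr, data[v], line_color='green')

# ...then overlay the known-correct guess in red
p.line(xr, data[key[bnum]], line_color='red')
show(p)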

You should see a graph of red and green lines over time (samples). The red line is the correlation for the correct subkey of the byte selected by `bnum`, while the other guesses are in green. You can use this graph to help fine-tune the windowing of the data.
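If you prefer to pick the window numerically rather than by eye, the following sketch finds the sample where the correct guess peaks and returns a small range around it. It runs on synthetic data; `pick_window` and its `margin` parameter are hypothetical helpers, and the returned pair is a half-open range in the style of Python slicing, so check how your analysis API interprets the end point before passing it on. This only works in a lab setting where the key is known.

```python
import random


def pick_window(diffs, correct_guess, margin=2):
    """Return (start, end) around the strongest sample of the correct guess.

    diffs is indexed as diffs[guess][sample]; end is exclusive.
    """
    row = diffs[correct_guess]
    peak = max(range(len(row)), key=lambda i: abs(row[i]))
    start = max(0, peak - margin)
    end = min(len(row), peak + margin + 1)
    return start, end


# Synthetic correlation data (assumption): 256 guesses x 20 samples of
# low-level noise, with a clear peak for the correct guess at sample 12.
rng = random.Random(1)
num_samples = 20
diffs = [[rng.uniform(-0.05, 0.05) for _ in range(num_samples)]
         for _ in range(256)]
correct_guess = 0x2B
diffs[correct_guess][12] = 0.6

window = pick_window(diffs, correct_guess, margin=2)
print(window)  # → (10, 15)
```

A narrow window keeps the attack fast and removes samples that only add noise, but too small a margin can miss the leak if the traces are not perfectly aligned.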

## Conclusion

Attacking hardware crypto is similar to any other DPA-style attack. In this example we concentrated on the standard "Last Round State" leakage model to break a real hardware accelerator.