# Breaking Hardware ECC on CW305 FPGA, part 5

This builds on the CW305_ECC part 1, 2, 3 and 4 notebooks; be sure to work through them before starting this one.

In this notebook, we look at what TVLA can tell us about our target's leakage.

The tutorial was developed with a CW-Pro with the 100t FPGA; the observations made in the attack's development should be accurate if you're using the same, but other combinations of CW-Pro / CW-Lite / CW-Husky / 100t / 35t may behave somewhat differently.

## Setup

See CW305_ECC.ipynb for explanations which are not repeated here.

In [None]:
#PLATFORM = 'CWLITE'
PLATFORM = 'CWPRO'
#PLATFORM = 'CWHUSKY'

In [None]:
import chipwhisperer as cw
scope = cw.scope()
target = cw.target(scope, cw.targets.CW305_ECC, fpga_id='100t', force=False) # or fpga_id='35t', as appropriate

In [None]:
%run "CW305_ECC_setup.ipynb"

In [None]:
# ensure ADC is locked:
scope.clock.reset_adc()
assert (scope.clock.adc_locked), "ADC failed to lock"

Occasionally the ADC will fail to lock on the first try; when that happens, the above assertion will fail (and on the CW-Lite, the red LED will be on). Simply re-running the above cell should fix things.

# TVLA

We begin by conducting a TVLA test on the original core.

Normally, TVLA is conducted on the full target operation. Here, however, the target operation is very long and TVLA requires many traces. Since we already know that, apart from the leading 1, the leakage for each bit of $k$ looks identical, we'll only capture the first few bits.

(Capture the full operation if you wish, but be mindful that it's easy to run into "out of memory" errors. If this happens, reduce the number of traces, or capture and save them to disk in chunks. The TVLA calculation itself will also take a lot more time on the full traces.)
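If memory is the limiting factor, an alternative to storing every trace is to accumulate per-sample running statistics and discard each trace after it has been folded in. The sketch below is not part of the original notebook: `RunningStats` and `welch_t` are hypothetical helpers, the data is synthetic, and it assumes Welch's t-statistic (the quantity returned by the `scipy.stats.ttest_ind(..., equal_var=False)` call used later in this notebook) is all you need.

```python
import numpy as np

class RunningStats:
    """Per-sample running mean and variance (Welford's algorithm)."""

    def __init__(self, samples):
        self.n = 0
        self.mean = np.zeros(samples)
        self.m2 = np.zeros(samples)  # running sum of squared deviations

    def update(self, wave):
        wave = np.asarray(wave, dtype=float)
        self.n += 1
        delta = wave - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (wave - self.mean)

    def variance(self):
        # unbiased sample variance (equivalent to ddof=1)
        return self.m2 / (self.n - 1)


def welch_t(a, b):
    """Welch's t-statistic per sample from two RunningStats accumulators."""
    return (a.mean - b.mean) / np.sqrt(a.variance() / a.n + b.variance() / b.n)


# Synthetic stand-in for captured traces (not real measurements).
rng = np.random.default_rng(1)
fixed = rng.normal(0.0, 1.0, size=(200, 16))
rand = rng.normal(0.0, 1.0, size=(200, 16))
rand[:, 5] += 1.0  # artificial leak at sample 5

stats_fixed, stats_rand = RunningStats(16), RunningStats(16)
for wave in fixed:
    stats_fixed.update(wave)  # the wave could be discarded after this call
for wave in rand:
    stats_rand.update(wave)

t_running = welch_t(stats_fixed, stats_rand)
```

Only two arrays per group are kept in memory, regardless of how many traces are captured.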

Furthermore, in the interest of time, we'll only capture 2000 traces. Feel free to capture more; 10000 per group is a common number, as per "Security Level 3" of ISO 17825, but it does not make much difference to our results.

In [None]:
change_bitfile('original')

In [None]:
import tvlattest_ecc as TVLA
ktp = TVLA.TVLATTest_ECC(target.curve)

In [None]:
set_adc(samples=int(cycles[9]))

In [None]:
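from tqdm.notebook import tnrange  # in case the setup notebook did not already import it

def get_tvla_traces(N=2000, group=2):
    """Capture N traces, interleaving group 0 with the requested test group."""
    traces = []
    ktp.init(traces=N, groups=[0,group])
    for i in tnrange(N, desc='Capturing traces'):
        # use a separate name so the `group` argument is not shadowed
        k, P, trace_group = ktp.next()
        ret = target.capture_trace(scope, Px=P.x, Py=P.y, k=k)
        if not ret:
            print("Failed capture")
            continue
        # tag the trace with its group so calc_tvla() can split them later
        ret.textout['group'] = trace_group
        traces.append(ret)
    return traces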

By specifying `group=2`, we will capture traces belonging to groups 0 and 2, which are defined in our `TVLATTest_ECC` class.

Group 0 holds $k$ and the base point $P$ fixed for each capture.

Group 2 holds $P$ fixed, and randomizes $k$ for each capture.

The TVLA test looks at statistical differences between these two groups, so this test will show the effect of varying $k$.
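To make the fixed-vs-random idea concrete, here is a minimal sketch on synthetic data (not captured traces). It assumes a toy leakage model in which a single sample carries an offset proportional to one secret bit; `synthetic_trace` and `welch_t` are hypothetical names used only for illustration. The statistic computed is Welch's t, the same one that `scipy.stats.ttest_ind(..., equal_var=False)` returns.

```python
import numpy as np

rng = np.random.default_rng(0)
n_traces, n_samples, leak_sample = 1000, 50, 20

def synthetic_trace(bit):
    """Noise everywhere, plus a bit-dependent offset at one sample."""
    wave = rng.normal(0.0, 1.0, n_samples)
    wave[leak_sample] += 2.0 * bit
    return wave

# "Group 0": secret bit fixed; "group 2": secret bit drawn at random.
fixed_group = np.array([synthetic_trace(1) for _ in range(n_traces)])
random_group = np.array([synthetic_trace(rng.integers(0, 2)) for _ in range(n_traces)])

def welch_t(a, b):
    """Welch's t-statistic, computed independently for each sample index."""
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(
        a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))

t = welch_t(fixed_group, random_group)
leaky = np.where(np.abs(t) > 4.5)[0]  # samples exceeding the usual threshold
print(leaky)
```

Only the sample whose value depends on the secret should stand out; samples that are pure noise in both groups stay close to zero.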

In [None]:
tvla_traces = get_tvla_traces(N=2000, group=2)

In [None]:
def _group_traces(traces, group1, group2, start, stop, stat):
    """Split a set of traces into two lists of grouped data.
    Meant to be called by calc_tvla().
    Args:
        traces: list of traces
        group1 (int): first group to extract
        group2 (int): second group to extract
        start (int): start index for TVLA stats
        stop (int): stop index for TVLA stats; None = to the end
        stat (string): data to return:
            "full": full measured data
            "avgbit": average bit
    Returns:
        grouped_data: 2-element list of lists; 1st list is requested data for group1,
            2nd list is requested data for group2. Traces from other groups are ignored.
    """
    grouped_data = [[], []]
    for trace in traces:
        if stat == 'full':
            data = trace.wave[start:stop]
        elif stat == 'avgbit':
            data = trace.textout['avgbit']
        else:
            raise ValueError("stat must be 'full' or 'avgbit', got %r" % (stat,))
        if trace.textout['group'] == group1:
            grouped_data[0].append(data)
        elif trace.textout['group'] == group2:
            grouped_data[1].append(data)
            
    return grouped_data


In [None]:
def calc_tvla(traces, group1, group2, start=0, stop=None, stat='full', verbose=True):
    """Calculate TVLA data using power (default) or average-bit measurements.
    Args:
        traces: list of traces
        group1 (int): first group to use in computing TVLA
        group2 (int): second group to use in computing TVLA
        start (int): start index for TVLA stats (used when stat='full')
        stop (int): stop index for TVLA stats; None = to the end (used when stat='full')
        stat (string): compute TVLA using which data:
            "full": measured power data
            "avgbit": average bit
    Returns:
        ttrace_a, ttrace_b: t-statistics computed from the second and first half,
            respectively, of each group's traces
    """
    import scipy.stats
    import time
    start_time = time.time()
    grouped_data_1, grouped_data_2 = _group_traces(traces, group1, group2, start, stop, stat)
    
    if verbose:
        print("Found %d data points for group %d." % (len(grouped_data_1), group1))
        print("Found %d data points for group %d." % (len(grouped_data_2), group2))
    
    # then split each into two halves
    grouped_data_a = [[], []]
    grouped_data_b = [[], []]

    half = len(grouped_data_1)//2
    grouped_data_a[0] = grouped_data_1[half:]
    grouped_data_b[0] = grouped_data_1[:half]
    grouped_data_a[1] = grouped_data_2[half:]
    grouped_data_b[1] = grouped_data_2[:half]
    
    if verbose:
        print("Calculating TVLA... ", end='')
    ttrace_a = scipy.stats.ttest_ind(grouped_data_a[0], grouped_data_a[1], axis=0, equal_var=False)[0]
    if verbose:
        print('group A done... ', end='')
    ttrace_b = scipy.stats.ttest_ind(grouped_data_b[0], grouped_data_b[1], axis=0, equal_var=False)[0]
    if verbose:
        print('group B done.')
    elapsed_time = time.time() - start_time
    if verbose:
        print('Elapsed time: %d seconds.' % elapsed_time)
    return ttrace_a, ttrace_b

In [None]:
tta, ttb = calc_tvla(tvla_traces, 0, 2)

In [None]:
from bokeh.plotting import figure, show
from bokeh.resources import INLINE
from bokeh.io import output_notebook
from bokeh.models import Span, Legend, LegendItem

output_notebook(INLINE)

s = figure(plot_width=1300, plot_height=600, x_axis_label='clock cycle')

xrange = range(len(tta))

T = s.line(xrange, (tta), line_color='red')
T = s.line(xrange, (ttb), line_color='blue')

for c in cycles[:9]:
    s.renderers.extend([Span(location=c, dimension='height', line_color='black', line_width=1, line_dash='dashed')])

for p in [6, 4202]:
    for c in cycles[:9]:
        s.renderers.extend([Span(location=c+p, dimension='height', line_color='red', line_width=1, line_dash='dashed')])

s.renderers.extend([Span(location=-4.5, dimension='width', line_color='green', line_width=1, line_dash='dotted')])
s.renderers.extend([Span(location=+4.5, dimension='width', line_color='green', line_width=1, line_dash='dotted')])

In [None]:
show(s)

The raw TVLA plot above is annotated with black dashed lines which indicate the start time of each bit, and red dashed lines which indicate the DoM markers of our attack.

Finally, the green dotted lines indicate the TVLA pass/fail thresholds.
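For a sense of how strict the ±4.5 threshold is, the sketch below estimates how often a leak-free sample would exceed it by chance. It assumes the t-statistic is approximately standard normal (reasonable for large trace counts) and that the two halves of the trace set are independent; both are assumptions of this illustration, not claims made by the notebook.

```python
import math

threshold = 4.5

# Two-sided tail probability of a standard normal beyond +/- threshold.
p_single = math.erfc(threshold / math.sqrt(2))

# Requiring the same sample to fail in two independent halves
# squares that probability.
p_both = p_single ** 2

print(f"single test: {p_single:.2e}")
print(f"both halves: {p_both:.2e}")
```

Under these assumptions, chance exceedances are rare enough that the failures discussed below cannot plausibly be attributed to noise alone.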

We make the following observations:
1. There are no failures during the first bit, as expected.
2. The first failures coincide *exactly* with our cycles 6-7 DoM markers.
3. Failures also coincide with our cycles 4202-4203 markers.
4. After the first bit, numerous large failures abound throughout the subsequent bits.

Let's quantify the last point:

In [None]:
threshold = 4.5
for b in range(9):
    fails = 0
    for cycle in range(4204):
        # a sample counts as a failure only if it fails in both halves of the trace set
        if (abs(tta[cycles[b] + cycle]) > threshold) and (abs(ttb[cycles[b] + cycle]) > threshold):
            fails += 1
    print('Bit %d: %4d failures (%3d percent of samples)' % (b, fails, int(fails/4204*100)))

We find that a **huge** number of samples are failing for bits 1 onwards; around 60% for bit 1, then diminishing on subsequent bits to stabilize around 25%.

So while our DoM markers are present in the TVLA failure set, they appear to be very much lost in the noise.
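The per-bit failure count used above can also be expressed without an inner loop. The following is a small sketch on made-up t-values; `count_failures`, `bit_starts` and `bit_len` are hypothetical names, standing in for the notebook's `cycles` array and the 4204-cycle bit length.

```python
import numpy as np

def count_failures(t_a, t_b, bit_starts, bit_len, threshold=4.5):
    """Count samples per bit where both t-traces exceed the threshold."""
    both_fail = (np.abs(t_a) > threshold) & (np.abs(t_b) > threshold)
    return [int(both_fail[s:s + bit_len].sum()) for s in bit_starts]

# Synthetic stand-ins: 3 "bits" of 10 samples each.
t_a = np.zeros(30)
t_b = np.zeros(30)
t_a[[12, 13, 25]] = [6.0, -7.0, 5.0]
t_b[[12, 13, 25]] = [5.0, -6.5, 1.0]  # sample 25 fails in only one half

counts = count_failures(t_a, t_b, bit_starts=[0, 10, 20], bit_len=10)
print(counts)  # [0, 2, 0]
```

Sample 25 is not counted because it exceeds the threshold in only one of the two t-traces, which mirrors the `and` condition in the loop.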

To visualize that, let's overlay the TVLA results with our original "average of zeros vs ones" plot which was used to identify the markers for our attack:

In [None]:
k = 0xffffffffffffffffffffffffffffffff00000000000000000000000000000000
avg_trace = get_traces(1, k)

In [None]:
samples = 4204
trace = avg_trace[0]
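avg_ones = np.zeros(samples)
ones_starts = cycles[1:128]  # skip bit 0 (the leading 1), leaving 127 segments
for start in ones_starts:
    avg_ones += trace.wave[start:start+samples]
avg_ones /= len(ones_starts)  # divide by the number of segments actually summed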

avg_zeros = np.zeros(samples)
for start in cycles[128:256]:
    avg_zeros += trace.wave[start:start+samples]
avg_zeros /= 128
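The averaging above is a difference-of-means (DoM) over per-bit segments of a single trace. As a rough illustration of why this exposes a marker, here is a sketch on synthetic data; the segment length, the bump at offset 7 and the key pattern are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
seg_len, n_bits = 40, 64
bits = np.array([1] * 32 + [0] * 32)  # known key pattern: ones, then zeros

# One synthetic "trace" cut into per-bit segments: noise everywhere,
# plus a bump at offset 7 whenever the processed bit is 1.
segments = rng.normal(0.0, 1.0, size=(n_bits, seg_len))
segments[bits == 1, 7] += 3.0

avg_ones = segments[bits == 1].mean(axis=0)
avg_zeros = segments[bits == 0].mean(axis=0)
dom = avg_ones - avg_zeros

marker = int(np.argmax(np.abs(dom)))  # offset with the largest difference
print(marker)
```

Averaging many segments suppresses the noise, so the bit-dependent offset dominates the difference.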

In [None]:
from bokeh.plotting import figure, show
from bokeh.resources import INLINE
from bokeh.io import output_notebook
from bokeh.models import Span, Legend, LegendItem

output_notebook(INLINE)

s = figure(plot_width=1300, plot_height=600, x_axis_label='clock cycle')

xrange = range(len(avg_zeros))
ratio = max(abs(tta)) / max(abs(avg_ones-avg_zeros))

T = s.line(xrange, (abs(tta[cycles[7]:cycles[8]])), line_color='red')
#T = s.line(xrange, (ttb[cycles[7]:cycles[8]]), line_color='blue')

A = s.line(xrange, ratio*abs(avg_ones-avg_zeros), line_color='blue')

# add legend:
legend = Legend(items=[
    LegendItem(label='D (scaled)', renderers=[A]),
    LegendItem(label='TVLA result', renderers=[T]),
])
s.add_layout(legend)
s.legend.label_text_font_size='16pt'

s.xaxis.axis_label_text_font_size = '20pt'
s.yaxis.axis_label_text_font_size = '20pt'
s.xaxis.major_label_text_font_size = '14pt'
s.yaxis.major_label_text_font_size = '14pt'
s.title.text_font_size = '20pt'

In [None]:
show(s)

Now, let's visualize what happens if we collect our attack markers from the TVLA failures (instead of from the DoM results):

In [None]:
k = 0x0000ffffffffff000000000000ffff00aaaa0000cccc00001111000033330000
traces = get_traces(20, k)

In [None]:
dom_poi = [4202, -4203, -6, 7]

In [None]:
def update_corrected_plot(no_traces, tvla_threshold):
    SSC.data_source.data['y'] = get_sums(traces[:no_traces], dom_poi)
    
    tvla_poi = list(np.where(abs(tta[cycles[7]:cycles[8]]) > tvla_threshold)[0])
    SSCtvla.data_source.data['y'] = get_sums(traces[:no_traces], tvla_poi)

    push_notebook()

In [None]:
from ipywidgets import interact, Layout
from bokeh.io import push_notebook, output_notebook

SC = figure(plot_width=1200, x_axis_label='k bit index', y_axis_label='D')

xrange = range(len(cycles)-1)
dom_sums = get_sums(traces, dom_poi)
SSC = SC.line(xrange, dom_sums, line_color='blue')

tvla_threshold = 20
tvla_poi = list(np.where(abs(tta[cycles[7]:cycles[8]]) > tvla_threshold)[0])
tvla_sums = get_sums(traces, tvla_poi)
SSCtvla = SC.line(xrange, tvla_sums, line_color='red')

SC.xaxis.axis_label_text_font_size = '20pt'
SC.yaxis.axis_label_text_font_size = '20pt'
SC.xaxis.major_label_text_font_size = '14pt'
SC.yaxis.major_label_text_font_size = '14pt'
SC.title.text_font_size = '20pt'

In [None]:
show(SC, notebook_handle=True)

In [None]:
interact(update_corrected_plot, no_traces=(1, len(traces)), tvla_threshold=(4, 40))

As you play with the `no_traces` and `tvla_threshold` knobs, it should become apparent that using markers extracted from TVLA failures is an **excellent** distinguisher for the leading 1 of $k$, but nothing else.
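To see what the `tvla_threshold` knob does to the marker set, the sketch below applies the same `np.where(abs(...) > threshold)` selection to a short, made-up t-trace. The values are invented; in the notebook the input is the slice of `tta` covering one bit, so the resulting indices are offsets relative to the start of that bit.

```python
import numpy as np

# Synthetic t-values for one bit's worth of samples (not real data).
t_bit = np.array([0.5, -3.0, 12.0, -25.0, 4.0, 31.0, -8.0, 2.0])

poi_by_threshold = {}
for thr in (4.5, 10, 20, 30):
    poi_by_threshold[thr] = np.where(np.abs(t_bit) > thr)[0].tolist()
    print(thr, poi_by_threshold[thr])
```

Raising the threshold keeps only the strongest failures, which shrinks the marker set but does not change what those markers are sensitive to.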

This makes sense: we can infer that the large number of TVLA failures point to the leakage caused by the leading 1 (and our work on attempt #4 in part 4 of this series supports that).

(This also explains why the number of TVLA failures is highest for bit 1 and then decreases, as the leading 1 becomes less and less likely to occur for each subsequent bit.)
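The decreasing trend can be illustrated with a little arithmetic. The sketch below assumes every bit of $k$ is independently 0 or 1 with probability 1/2; this is an assumption made for illustration, and the actual distribution used by `TVLATTest_ECC` may differ. `p_leading_one_at` is a hypothetical helper.

```python
from fractions import Fraction

def p_leading_one_at(i):
    """Probability that the first 1 bit of k is at position i
    (i = 0 is the most significant bit), assuming uniform random bits."""
    return Fraction(1, 2 ** (i + 1))

for i in range(9):
    print(f"bit {i}: {float(p_leading_one_at(i)):.4f}")
```

Under this assumption the probability halves with each position, so the leading 1 contributes less and less to the group difference at later bits. This only illustrates the qualitative trend; it is not a model of the measured failure percentages.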

To wrap up, let's repeat this exercise with the bitfile from attempt #4:

# Attempt #4 revisited

In [None]:
change_bitfile('attempt4')

In [None]:
set_adc(samples=int(cycles[9]))

In [None]:
tvla_traces4 = get_tvla_traces(N=2000)

In [None]:
tta4, ttb4 = calc_tvla(tvla_traces4, 0, 2)

In [None]:
from bokeh.plotting import figure, show
from bokeh.resources import INLINE
from bokeh.io import output_notebook
from bokeh.models import Span, Legend, LegendItem

output_notebook(INLINE)

s4 = figure(plot_width=1300, plot_height=600, x_axis_label='clock cycle')

xrange = range(len(tta4))

T4 = s4.line(xrange, (tta4), line_color='red')
T4 = s4.line(xrange, (ttb4), line_color='blue')

for c in cycles[:9]:
    s4.renderers.extend([Span(location=c, dimension='height', line_color='black', line_width=1, line_dash='dashed')])

for p in [6, 4202]:
    for c in cycles[:9]:
        s4.renderers.extend([Span(location=c+p, dimension='height', line_color='red', line_width=1, line_dash='dashed')])

s4.renderers.extend([Span(location=-4.5, dimension='width', line_color='green', line_width=1, line_dash='dotted')])
s4.renderers.extend([Span(location=+4.5, dimension='width', line_color='green', line_width=1, line_dash='dotted')])

In [None]:
show(s4)

In [None]:
threshold = 4.5
for b in range(9):
    fails = 0
    for cycle in range(4204):
        # a sample counts as a failure only if it fails in both halves of the trace set
        if (abs(tta4[cycles[b] + cycle]) > threshold) and (abs(ttb4[cycles[b] + cycle]) > threshold):
            fails += 1
    print('Bit %d: %4d failures (%3d percent of samples)' % (b, fails, int(fails/4204*100)))

So far, this looks the same as it did with the original bitfile.

In [None]:
k = 0x0000ffffffffff000000000000ffff00aaaa0000cccc00001111000033330000
traces4 = get_traces(20, k)

In [None]:
dom_poi4 = [4201, -4202, -6, 7]

In [None]:
def update_corrected_plot4(no_traces, tvla_threshold):
    SSC4.data_source.data['y'] = get_sums(traces4[:no_traces], dom_poi4)
    
    # use the attempt #4 TVLA results (tta4), not those of the original bitfile
    tvla_poi = list(np.where(abs(tta4[cycles[7]:cycles[8]]) > tvla_threshold)[0])
    SSCtvla4.data_source.data['y'] = get_sums(traces4[:no_traces], tvla_poi)

    push_notebook()

In [None]:
from ipywidgets import interact, Layout
from bokeh.io import push_notebook, output_notebook

SC4 = figure(plot_width=1200, x_axis_label='k bit index', y_axis_label='D')

xrange = range(len(cycles)-1)
dom_sums4 = get_sums(traces4, dom_poi4)
SSC4 = SC4.line(xrange, dom_sums4, line_color='blue')

tvla_threshold4 = 20
tvla_poi4 = list(np.where(abs(tta4[cycles[7]:cycles[8]]) > tvla_threshold4)[0])
tvla_sums4 = get_sums(traces4, tvla_poi4)
SSCtvla4 = SC4.line(xrange, tvla_sums4, line_color='red')

SC4.xaxis.axis_label_text_font_size = '20pt'
SC4.yaxis.axis_label_text_font_size = '20pt'
SC4.xaxis.major_label_text_font_size = '14pt'
SC4.yaxis.major_label_text_font_size = '14pt'
SC4.title.text_font_size = '20pt'

In [None]:
show(SC4, notebook_handle=True)

In [None]:
interact(update_corrected_plot4, no_traces=(1, len(traces4)), tvla_threshold=(4, 40))

As you may have expected, the results are similar to what we saw with the original bitfile: using markers extracted from TVLA failures is an **excellent** distinguisher for the leading 1 and leading 0 of $k$, but nothing else.

Now, you may think that TVLA could be used to reveal the specific markers for bits that come after the leading 1; what if we define our test group (i.e. group 2, defined at the start of this notebook) to have a randomized $k$ where the leading 1 is always at the same position?

Try it! Re-run with the group set to 15. You'll see that this modified TVLA test gives us essentially the very same markers as the original TVLA test, except that it is now the *second* leading one that is leaked. So, we are no further ahead. The reason for this should be obvious after some thought.

This highlights the limitations of TVLA in the context of our particular target and attack.

# Conclusion

This 5-part series of demos has covered a lot of ground for hardware-based ECC attacks and defenses. Hopefully you found it useful!