# Segmented Buffer Example

The segmented buffer mode splits the ChipWhisperer-Lite/Pro sample buffer (24K/100K samples) into small segments. When you capture short traces (such as with hardware AES), a single trace fills only a small part of the buffer:

                [ ~~~~~~................................................... ]
                     |                      |
    One power trace -+        Empty Space --+
   
Each capture then pays the overhead of unloading the buffer for only that small amount of data. With segmented capture, we instead do this:

                [ ~~~~~~|~~~~~~|~~~~~~|~~~~~~|~~~~~~|~~~~~~|~~~~~~|~~~~~~|~ ]
                    |       |              \ /                            |
    Power trace  1 -+       |               |             Partial trace---+
    Power trace 2 ----------+               |
    Power traces 3..4..etc -----------------+
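
The layout in the diagram can be worked out with a little arithmetic. The following is a minimal sketch, not part of the ChipWhisperer API: `buffer_layout` is a hypothetical helper, and the 24,400-sample buffer and 1,200-sample trace are assumed figures chosen to match the "24K" buffer mentioned above.

```python
def buffer_layout(buffer_size, samples_per_trace):
    """Return (full_segments, leftover_samples, single_trace_utilization)."""
    # divmod gives the number of whole traces that fit and the samples left over
    full_segments, leftover = divmod(buffer_size, samples_per_trace)
    # fraction of the buffer used when only one trace is captured per unload
    utilization = samples_per_trace / buffer_size
    return full_segments, leftover, utilization


# Assumed figures: a 24,400-sample buffer and 1,200 samples per trace.
full, leftover, util = buffer_layout(24400, 1200)
print(f"{full} full segments, {leftover} samples left for a partial trace")
print(f"a single trace fills {util:.1%} of the buffer")
```

With these assumed numbers, a single trace uses only a few percent of the buffer, which is why unloading after every trace is wasteful.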
    
We'll show an example of this with the STM32F415 device.

This will require a special operating mode that re-runs encryptions a number of times, so we don't have the serial protocol overhead (which would slow our capture down). You may want to modify the firmware to have less overhead in calling the hardware AES core as well.

This segmented run mode was added in recent `simpleserial-aes` firmware. These are the commands that were added:

    simpleserial_addcmd('s', 2, enc_multi_setnum);
    simpleserial_addcmd('f', 16, enc_multi_getpt);

The multi-encryption command simply runs the encryption repeatedly, with each new input being the previous output:

    for(unsigned int i = 0; i < num_encryption_rounds; i++){
        trigger_high();
        aes_indep_enc(pt);
        trigger_low();
    }
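
Because each input is the previous output, the host can reconstruct the input of every segment from the first plaintext alone. The sketch below illustrates that chaining under the assumption that the firmware overwrites `pt` in place, as the description above suggests. `chained_blocks` and `toy_cipher` are hypothetical names; `toy_cipher` is only a stand-in so the example runs without extra packages, and a real analysis would swap in AES-128 with the target's key.

```python
def chained_blocks(encrypt_block, first_input, count):
    """Return a list of (input, output) pairs for `count` chained encryptions."""
    pairs = []
    block = bytes(first_input)
    for _ in range(count):
        out = encrypt_block(block)
        pairs.append((block, out))
        block = out  # the old output becomes the new input
    return pairs


def toy_cipher(block):
    # Stand-in for AES so the sketch is self-contained. NOT a real cipher.
    return bytes((b + 1) % 256 for b in reversed(block))


pairs = chained_blocks(toy_cipher, bytes(16), 3)
for i, (pt, ct) in enumerate(pairs):
    print(i, pt.hex(), ct.hex())
```

Segment `i` of a segmented capture would then correspond to `pairs[i]`, provided the number of encryptions matches the number of triggers.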

## STM32F415 Hardware Programming / Setup

This example uses the STM32F415 because it supports hardware AES. The segmented capture only provides serious speed-ups when the encryption itself is already very fast. The default software AES won't provide as dramatic a speed-up.

Note you can use the "simpleserial" protocol V2 (SS_V2), which uses binary communications. This will further improve the speed-up by giving you less overhead for loading/unloading data.


In [None]:
SCOPETYPE = 'OPENADC'
PLATFORM = 'CW308_STM32F4'
CRYPTO_TARGET = 'HWAES'
num_traces = 50
CHECK_CORR = False

In [None]:
%%bash -s "$PLATFORM" "$CRYPTO_TARGET"
cd ../../hardware/victims/firmware/simpleserial-aes
make PLATFORM=$1 CRYPTO_TARGET=$2

In [None]:
%run "../Setup_Scripts/Setup_Generic.ipynb"

In [None]:
fw_path = '../../hardware/victims/firmware/simpleserial-aes/simpleserial-aes-{}.hex'.format(PLATFORM)

In [None]:
cw.program_target(scope, prog, fw_path)

## FPGA Version Note

An updated bitstream was previously required for this. If you are on the latest ChipWhisperer (from git), it will be loaded automatically. If you wish to test other bitstream configurations, you can load one with:

    scope.reload_fpga("cwlite_interface.bit")
    %run "../Setup_Scripts/Setup_Generic.ipynb"

But this is **not required to continue**.

## Warning on sample size

There is a bug related to the sample size: you must set `scope.adc.samples` above some threshold (it appears to be around 240). If you set it too low, you'll get an error about insufficient data being returned. The cause is still under investigation.
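
One way to avoid tripping over this is to validate the segment length before assigning it. This is a minimal sketch: `check_segment_samples` is a hypothetical helper, and the value 240 is only the approximate threshold quoted above, not a documented limit.

```python
# Approximate threshold from the note above; an assumption, not a documented limit.
MIN_SEGMENT_SAMPLES = 240


def check_segment_samples(samples, minimum=MIN_SEGMENT_SAMPLES):
    """Return `samples` unchanged, or raise if it is below the assumed threshold."""
    if samples < minimum:
        raise ValueError(
            "segment length %d is below the assumed minimum of %d samples"
            % (samples, minimum)
        )
    return samples


# Intended use (requires a connected scope):
#   scope.adc.samples = check_segment_samples(1200)
print(check_segment_samples(1200))
```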

## Capture Example - Multiencryption Command

The following captures the first 1200 points of each trace and records them. We'll use the multiencryption commands to increase the rate at which encryptions occur.

In [None]:
scope.adc.samples = 1200

In [None]:
scope.adc.fifo_fill_mode = "segment"

In [None]:
import struct
import time

# IMPORTANT - we now need to generate enough triggers such that scope.adc.samples * NUM_TRIGGERS > max_fifo_size.
#             If not, the HW won't exit capture mode. In this example code we will just call the function
#             that many times.
max_fifo_size = scope.adc.oa.hwMaxSamples
segments_to_capture = round(max_fifo_size / scope.adc.samples + 1)

target.simpleserial_write('s', struct.pack(">H", segments_to_capture))

start = time.time() #perf reasons only

scope.arm()
target.simpleserial_write('f', bytearray([0]*16))
scope.capture_segmented() 

target.simpleserial_read('r', target.output_len)

# Get segments now
segs = scope.get_last_trace_segmented()

end = time.time() #perf reasons only

print("Captured %d segments of %d points each: "%(len(segs), len(segs[0])))
print("  %fs time"%(end-start))
print("  %d traces/second"%(len(segs) / (end-start)))
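
The trigger-count rule and the 2-byte argument of the `s` command can be checked offline. The sketch below assumes a 24,400-sample buffer (the real value comes from `scope.adc.oa.hwMaxSamples`); `segments_needed` and `pack_segment_count` are hypothetical helper names that mirror the expressions used in the cell above.

```python
import struct


def segments_needed(max_fifo_size, samples):
    """Same expression as the capture cell: enough triggers to overfill the FIFO."""
    return round(max_fifo_size / samples + 1)


def pack_segment_count(count):
    """Pack the count as an unsigned 16-bit big-endian value for the 's' command."""
    if not 0 < count <= 0xFFFF:
        raise ValueError("segment count must fit in 16 bits")
    return struct.pack(">H", count)


# Assumed buffer size; on real hardware read scope.adc.oa.hwMaxSamples instead.
n = segments_needed(24400, 1200)
payload = pack_segment_count(n)
print(n, payload.hex(), n * 1200 > 24400)
```

Since the `s` command takes two bytes, the count cannot exceed 65535, which is far more than either buffer size needs at this segment length.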


In [None]:
%matplotlib notebook
import matplotlib.pylab as plt

#For reference - you can see all the traces here as one (how it's read from the buffer)

wave = scope.get_last_trace()
plt.figure()
plt.plot(wave)

In [None]:
#Example of reading the segments with `get_last_trace_segmented()` that does the work for you.
plt.figure()
for seg in segs:
    plt.plot(seg)

## Example with slow target communications

The following uses the normal serial code. You'll find that the overhead dominates: you only get ~42 traces/second. You can improve that by switching to SS_V2, which is a binary protocol, but the speed will still be limited here.

In [None]:
import time
# IMPORTANT - we now need to generate enough triggers such that scope.adc.samples * NUM_TRIGGERS > max_fifo_size.
#             If not, the HW won't exit capture mode. In this example code we will just call the function
#             that many times.
max_fifo_size = scope.adc.oa.hwMaxSamples
segments_to_capture = round(max_fifo_size / scope.adc.samples + 1)

start = time.time() #perf reasons only

scope.arm()

for i in range(0, segments_to_capture):
    target.simpleserial_write('p', bytearray([0]*16))
    target.simpleserial_read('r', target.output_len)

capdone = time.time() #perf reasons only

scope.capture_segmented() 

# Get segments now
segs = scope.get_last_trace_segmented()

end = time.time() #perf reasons only

print("Captured %d segments of %d points each: "%(len(segs), len(segs[0])))
print("  %fs capture time"%(capdone-start))
print("  %fs dl/proc time"%(end-capdone))
print("  %d traces/second"%(len(segs) / (end-start)))

In [None]:
%matplotlib notebook
import matplotlib.pylab as plt

#For reference - you can see all the traces here as one (how it's read from the buffer)

wave = scope.get_last_trace()
plt.figure()
plt.plot(wave)

In [None]:
import numpy as np
seg_len = scope.adc.samples
num_seg = int(len(wave) / seg_len)
segs = np.reshape(wave[:num_seg*seg_len], (num_seg, seg_len))
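
The reshape above keeps only whole segments and drops the partial trace at the end of the buffer. A plain-Python equivalent, shown here on a tiny synthetic buffer, makes that behaviour easy to see. `split_segments` is a hypothetical helper used only for illustration.

```python
def split_segments(wave, seg_len):
    """Split `wave` into consecutive chunks of `seg_len`, dropping any partial tail."""
    num_seg = len(wave) // seg_len
    return [wave[i * seg_len:(i + 1) * seg_len] for i in range(num_seg)]


# Synthetic 10-sample buffer with 4 samples per segment:
# two full segments fit, and the trailing 2 samples are discarded.
demo_wave = list(range(10))
demo_segs = split_segments(demo_wave, 4)
print(demo_segs)
```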

In [None]:
#Plot a short window (samples 200-300) of each manually reshaped segment.
plt.figure()
for seg in segs:
    plt.plot(seg[200:300])