# MSP430 SCA Version 4

This code was developed to support a side-channel attack (SCA) on AES-128 encryption running on an MSP430. Code from the following sources was either copied directly or used as an example:
- https://wiki.newae.com/V4:Tutorial_B6_Breaking_AES_(Manual_CPA_Attack)
- ChatGPT

Version 4 includes the following updates from version 3:
- Averaging for copies of the same plaintext was implemented
- Files are still sorted in order, but instead of sorting by date modified, they are sorted by the trailing substring of the filename, which contains the date and time the file was saved on the oscilloscope
- The first S-box flag is used to begin power trace data collection, but a fixed length is given to collect the remaining power trace data rather than using the second S-box flag
- The plaintext and power trace numpy arrays are saved after being created for future use 

__This code will be run several times to save different sets of power trace and plaintext data. Change the filenames of the saved numpy arrays whenever new data is saved; otherwise the old data will be overwritten.__

## Library Import

In [148]:
# for numpy array and plotting
import numpy as np
import matplotlib.pyplot as plt

# for list of plaintext
import os

# for loading .mat files into Python
from scipy.io import loadmat
from datetime import datetime

# for program runtime
import time

## Data Preprocessing

### Load files

This code reads data from the external SSD used for data collection, because the local machine does not have enough storage to hold all of the files. The files are ordered by a substring at the end of each filename that records the date and time the file was saved.
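
The ordering by timestamp substring can be tried in isolation. This is a minimal sketch, assuming filenames end in `_<digits>.mat` as in `power_trace_ch4_20240214143346983.mat`; the helper name `timestamp_key` and the other sample names are hypothetical and only serve the illustration.

```python
def timestamp_key(file_name):
    """Return the trailing timestamp digits of a file name as an integer."""
    stem = file_name.rsplit('.', 1)[0]   # drop the extension
    return int(stem.rsplit('_', 1)[-1])  # text after the last underscore

# Hypothetical sample names, deliberately out of order
sample_names = [
    'power_trace_ch5_20240214143347120.mat',
    'power_trace_ch4_20240214143346983.mat',
    'power_trace_math_20240214143347501.mat',
]

sorted_names = sorted(sample_names, key=timestamp_key)
for name in sorted_names:
    print(name)
```

Converting the substring to an integer keeps the order numeric, so a name that does not end in digits raises a `ValueError` instead of being silently misplaced.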

In [149]:
# Specify the path to your folder
folder_path = r'E:\srand1_1000'  # raw string so the backslash is not treated as an escape

# Get a list of filenames in the folder
file_names = os.listdir(folder_path)

# Sort filenames by the timestamp digits at the end of each name
file_names = sorted(file_names, key=lambda x: int(x.split('_')[-1].split('.')[0]))

print(f"Last filename: {file_names[-1]}")
print(f"Number of files: {len(file_names)}")

FileNotFoundError: [WinError 3] The system cannot find the path specified: 'E:\\srand1_1000'

### Format files

This code walks through the MSP430 digital data to pair each plaintext with its corresponding analog power trace. The power traces are also lightly filtered by standardizing the length of the saved trace, as covered in the version 4 update description. Traces that share the same plaintext are averaged into a single trace.
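
The averaging of repeated plaintexts can be sketched on toy data. This is a simplified illustration rather than the notebook's exact loop: the function name `average_by_plaintext` and the toy arrays are hypothetical, it assumes copies of the same plaintext are adjacent in capture order, and it flushes the final group explicitly after the loop.

```python
import numpy as np

def average_by_plaintext(plaintexts, traces):
    """Average consecutive traces that share the same plaintext.

    plaintexts: list of byte lists, one per capture (repeats are adjacent)
    traces:     list of equal-length 1-D arrays, one per capture
    Returns (unique_plaintexts, averaged_traces).
    """
    unique_plaintexts, averaged = [], []
    group = []
    for pt, trace in zip(plaintexts, traces):
        if unique_plaintexts and pt == unique_plaintexts[-1]:
            group.append(trace)            # another copy of the current plaintext
        else:
            if group:                      # close out the previous group
                averaged.append(np.mean(group, axis=0))
            unique_plaintexts.append(pt)
            group = [trace]
    if group:                              # flush the final group
        averaged.append(np.mean(group, axis=0))
    return unique_plaintexts, np.array(averaged)

# Toy data: two copies of one plaintext, then one copy of another
pts = [[1, 2], [1, 2], [3, 4]]
trs = [np.array([0.0, 2.0]), np.array([2.0, 4.0]), np.array([5.0, 5.0])]
unique_pts, avg = average_by_plaintext(pts, trs)
print(unique_pts)
print(avg)
```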

In [147]:
bit_pos = ["D0", "D1", "D2", "D3", "D4", "D5", "D6", "D7"] # bit position
decimal_plaintext = []
previous_plaintext = []
traces_3d = []
five_traces = []
avg_trace = []

# function to convert hex plaintext to decimal plaintext
def hex_to_decimal(hex_str):
    hex_pairs = [hex_str[i:i+2] for i in range(0, len(hex_str), 2)]
    decimal_numbers = [int(pair, 16) for pair in hex_pairs]
    return decimal_numbers

start_time = time.time()

# organizing files
for j in range(len(file_names)):
    mat_data = loadmat(os.path.join(folder_path, file_names[j]))
    
    # sorting based on ch4/ch5
    if "ch" in file_names[j]:
        if "ch4" in file_names[j]:
            plaintext_col = mat_data[bit_pos[7]]
            init = 6  
            tot = 3
        elif "ch5" in file_names[j]:
            init = 4 
            tot = 5
        else:
            print(file_names[j] + " not loaded")
            break
        
        # stacking the column vector of plaintext data
        for k in range(tot):
            plaintext_col = np.column_stack((plaintext_col, mat_data[bit_pos[init - k]]))
    
    # these are the math files
    else:
        # save power trace data
        single_trace = mat_data['data']
        
        # transpose plaintext matrix so it is in the correct format
        plaintext = plaintext_col.T
        
        # identify the plaintext flags and sample plaintext data at those points
        my_array = plaintext[4] # this is the array that contains the plaintext flag
        positions = []
        
        # finds the positions in the array where there is a 0->1 transition
        for i, value in enumerate(my_array[:-1]):
            if value == 0 and my_array[i + 1] == 1:
                positions.append(i + 1)
        
        # Remove the plaintext flag row (row 4) from the plaintext data
        data = plaintext[np.arange(plaintext.shape[0]) != 4]

        # Extract values using array indexing
        result_values = data[np.arange(data.shape[0])[:, None], positions]
        plaintext_bits = []
        
        # Append the plaintext data based on the flag bits together
        for x in range(len(result_values[0])):
            for y in range(len(result_values)):
                plaintext_bits.append(result_values[y][x])

        # convert the plaintext_bits list to a numpy array
        plaintext_hex = np.array(plaintext_bits)

        # Reshape the array into chunks of 4 bits
        bits_matrix = plaintext_hex.reshape(-1, 4)

        # Convert each chunk to its hexadecimal representation
        hex_list = [''.join(map(str, chunk)) for chunk in bits_matrix]
        hex_numbers = [hex(int(chunk, 2))[2:] for chunk in hex_list]
        hex_string = ''.join(hex_numbers[0:32])
        
        # start previous_hex_string on the right plaintext
        if j == 0:
            previous_hex_string = hex_string
        
        # Convert the hex string into decimal representation for CPA algorithm
        decimal_numbers = hex_to_decimal(hex_string)
        
        if previous_plaintext == decimal_numbers or len(previous_plaintext) == 0:
            try:
                five_traces.append(single_trace[positions[16]:positions[16]+1127500]) # this is to standardize the length
                print(f"Copy {len(five_traces)} of unique trace {len(traces_3d)}")
            except IndexError as e:
                # positions[16] does not exist when fewer than 17 flag edges were found
                print(f"Error: Invalid plaintext {previous_hex_string}, not included in pre-averaging array")
            previous_plaintext = decimal_numbers
            previous_hex_string = hex_string
        else:
            # average the five traces
            avg_trace = np.mean(np.array(five_traces), axis=0).reshape((1, -1))
            
            # append averaged array to traces matrix & previous plaintext (the one for the averaged group) to plaintext matrix
            traces_3d.append(avg_trace)
            decimal_plaintext.append(previous_plaintext)
            
            # reset five traces matrix and update previous plaintext
            five_traces = []
            previous_plaintext = decimal_numbers
            
            print(f"Updating traces and plaintext {previous_hex_string}")
            print(f"Traces size: {len(traces_3d)} | Plaintext size: {len(decimal_plaintext)}\n")
            
            # load new trace data
            try:
                five_traces.append(single_trace[positions[16]:positions[16]+1127500]) # this is to standardize the length
                print(f"Copy {len(five_traces)} of unique trace {len(traces_3d)}")
            except IndexError as e:
                # positions[16] does not exist when fewer than 17 flag edges were found
                print(f"Error: Invalid plaintext {previous_hex_string}, not included in pre-averaging array")
                
            previous_hex_string = hex_string

end_time = time.time()
elapsed_time = end_time - start_time

print(f"Elapsed Time: {elapsed_time} seconds")        

FileNotFoundError: [Errno 2] No such file or directory: 'E:\\srand1_1000/power_trace_ch4_20240214143346983.mat'
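
The flag detection and fixed-length windowing used in the cell above can be written in vectorized form. This is a sketch on toy arrays, not a drop-in replacement: the helper names `rising_edges` and `fixed_window` are hypothetical, and the short window length stands in for the 1127500 samples used in the notebook.

```python
import numpy as np

def rising_edges(flag):
    """Indices i where flag[i-1] == 0 and flag[i] == 1."""
    flag = np.asarray(flag)
    return np.flatnonzero((flag[:-1] == 0) & (flag[1:] == 1)) + 1

def fixed_window(trace, start, length):
    """Return trace[start:start+length], or None if the trace is too short."""
    if start + length > len(trace):
        return None
    return trace[start:start + length]

flag = np.array([0, 0, 1, 1, 0, 1, 0])   # toy digital flag channel
edges = rising_edges(flag)

trace = np.arange(10)                     # toy analog trace
window = fixed_window(trace, edges[0], 4)
too_long = fixed_window(trace, edges[1], 10)
print(edges, window, too_long)
```

Returning `None` for a short trace makes the problem visible at capture time. Plain numpy slicing would instead return a shorter array without raising, which is why the notebook later filters traces by length.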

### Additional Filtering and Conversion to Numpy Array

This section is new since version 3. It removes traces that passed the first round of filtering but are still invalid because their length is not 1127500 samples. The plaintext and power trace data are then saved as numpy arrays that can be loaded outside this notebook, as described in the version 4 updates section.
__Make sure to rename the numpy arrays when new data is being saved so they are not overwritten.__

In [142]:
traces_3d_filtered = [elem for elem in traces_3d if elem.shape[1] == 1127500]
decimal_plaintext_filtered = [corr_elem for i, corr_elem in enumerate(decimal_plaintext) if traces_3d[i].shape[1] == 1127500]

traces = np.array(traces_3d_filtered)
traces = np.squeeze(traces, axis=1)

np.save('traces1_1000.npy', traces)
np.save('plaintext1_1000.npy', decimal_plaintext_filtered)
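
One way to guard against overwriting earlier data is to build the filenames from a dataset tag and refuse to save if the files already exist. This is a sketch under assumptions: the helper `save_dataset`, the tag `_demo`, and the toy arrays are hypothetical, and the plaintexts are assumed to be byte values that fit in `uint8`.

```python
import os
import tempfile
import numpy as np

def save_dataset(directory, tag, traces, plaintexts):
    """Save traces and plaintexts under names containing `tag`; never overwrite."""
    trace_path = os.path.join(directory, f'traces{tag}.npy')
    pt_path = os.path.join(directory, f'plaintext{tag}.npy')
    for path in (trace_path, pt_path):
        if os.path.exists(path):
            raise FileExistsError(f'{path} already exists; choose a new tag')
    np.save(trace_path, traces)
    np.save(pt_path, np.asarray(plaintexts, dtype=np.uint8))
    return trace_path, pt_path

# Toy data: 2 traces of 8 samples, 2 plaintexts of 16 bytes
demo_traces = np.zeros((2, 8))
demo_plaintexts = [[0] * 16, [255] * 16]

with tempfile.TemporaryDirectory() as tmp:
    t_path, p_path = save_dataset(tmp, '_demo', demo_traces, demo_plaintexts)
    loaded_traces = np.load(t_path)        # reload for later analysis
    loaded_plaintexts = np.load(p_path)

print(loaded_traces.shape, loaded_plaintexts.shape)
```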

## Attack Algorithm

This section runs the correlation power analysis (CPA) attack. It expects `traces` and `decimal_plaintext_filtered` to have been produced by the data preprocessing cells above.
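
At its core, the attack computes a Pearson correlation between a vector of hypothetical Hamming weights and every sample point of the traces. The following is a vectorized sketch of that single step on synthetic data; the function name `column_correlation` and the random inputs are hypothetical, and it is not the per-trace loop used in the cells below.

```python
import numpy as np

def column_correlation(hyp, traces):
    """Pearson correlation between hyp (n,) and every column of traces (n, m)."""
    hyp = np.asarray(hyp, dtype=np.float64)
    traces = np.asarray(traces, dtype=np.float64)
    hdiff = hyp - hyp.mean()                  # centered hypothesis
    tdiff = traces - traces.mean(axis=0)      # centered traces, per sample point
    numerator = hdiff @ tdiff
    denominator = np.sqrt(np.sum(hdiff ** 2) * np.sum(tdiff ** 2, axis=0))
    return numerator / denominator

# Synthetic data: 50 traces of 4 points, with leakage injected at point 2
rng = np.random.default_rng(0)
hyp = rng.integers(0, 9, size=50)             # Hamming weights range from 0 to 8
traces = rng.normal(size=(50, 4))
traces[:, 2] += 0.5 * hyp

corr = column_correlation(hyp, traces)
best_point = int(np.argmax(np.abs(corr)))
print(corr, best_point)
```

Computing the correlation for all sample points with one matrix product avoids the inner Python loop over traces, which may matter when each trace has over a million points.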

In [138]:
#Lookup table for number of 1's in binary numbers 0-255
HW = [bin(n).count("1") for n in range(0,256)] 

sbox=(
0x63,0x7c,0x77,0x7b,0xf2,0x6b,0x6f,0xc5,0x30,0x01,0x67,0x2b,0xfe,0xd7,0xab,0x76,
0xca,0x82,0xc9,0x7d,0xfa,0x59,0x47,0xf0,0xad,0xd4,0xa2,0xaf,0x9c,0xa4,0x72,0xc0,
0xb7,0xfd,0x93,0x26,0x36,0x3f,0xf7,0xcc,0x34,0xa5,0xe5,0xf1,0x71,0xd8,0x31,0x15,
0x04,0xc7,0x23,0xc3,0x18,0x96,0x05,0x9a,0x07,0x12,0x80,0xe2,0xeb,0x27,0xb2,0x75,
0x09,0x83,0x2c,0x1a,0x1b,0x6e,0x5a,0xa0,0x52,0x3b,0xd6,0xb3,0x29,0xe3,0x2f,0x84,
0x53,0xd1,0x00,0xed,0x20,0xfc,0xb1,0x5b,0x6a,0xcb,0xbe,0x39,0x4a,0x4c,0x58,0xcf,
0xd0,0xef,0xaa,0xfb,0x43,0x4d,0x33,0x85,0x45,0xf9,0x02,0x7f,0x50,0x3c,0x9f,0xa8,
0x51,0xa3,0x40,0x8f,0x92,0x9d,0x38,0xf5,0xbc,0xb6,0xda,0x21,0x10,0xff,0xf3,0xd2,
0xcd,0x0c,0x13,0xec,0x5f,0x97,0x44,0x17,0xc4,0xa7,0x7e,0x3d,0x64,0x5d,0x19,0x73,
0x60,0x81,0x4f,0xdc,0x22,0x2a,0x90,0x88,0x46,0xee,0xb8,0x14,0xde,0x5e,0x0b,0xdb,
0xe0,0x32,0x3a,0x0a,0x49,0x06,0x24,0x5c,0xc2,0xd3,0xac,0x62,0x91,0x95,0xe4,0x79,
0xe7,0xc8,0x37,0x6d,0x8d,0xd5,0x4e,0xa9,0x6c,0x56,0xf4,0xea,0x65,0x7a,0xae,0x08,
0xba,0x78,0x25,0x2e,0x1c,0xa6,0xb4,0xc6,0xe8,0xdd,0x74,0x1f,0x4b,0xbd,0x8b,0x8a,
0x70,0x3e,0xb5,0x66,0x48,0x03,0xf6,0x0e,0x61,0x35,0x57,0xb9,0x86,0xc1,0x1d,0x9e,
0xe1,0xf8,0x98,0x11,0x69,0xd9,0x8e,0x94,0x9b,0x1e,0x87,0xe9,0xce,0x55,0x28,0xdf,
0x8c,0xa1,0x89,0x0d,0xbf,0xe6,0x42,0x68,0x41,0x99,0x2d,0x0f,0xb0,0x54,0xbb,0x16)

In [139]:
def intermediate(pt, keyguess):
    return sbox[pt ^ keyguess]

pt = decimal_plaintext_filtered

numtraces = np.shape(traces)[0]-1
numpoint = np.shape(traces)[1]

#Use less than the maximum traces by setting numtraces to something
#numtraces = 15

bestguess = [0]*16

start_time = time.time()
#Only the first 4 subkeys are attacked here to save time; use range(0, 16) to recover all 16 subkeys
for bnum in range(0, 4):
    cpaoutput = [0]*256
    maxcpa = [0]*256
    for kguess in range(0, 256):
        print ("Subkey %2d, hyp = %02x: "%(bnum, kguess))


        #Initialize arrays & variables to zero
        sumnum = np.zeros(numpoint)
        sumden1 = np.zeros(numpoint)
        sumden2 = np.zeros(numpoint)

        hyp = np.zeros(numtraces)
        for tnum in range(0, numtraces):
            hyp[tnum] = HW[intermediate(pt[tnum][bnum], kguess)]


        #Mean of hypothesis
        meanh = np.mean(hyp, dtype=np.float64)

        #Mean of all points in trace
        meant = np.mean(traces, axis=0, dtype=np.float64)

        #For each trace, do the following
        for tnum in range(0, numtraces):
            hdiff = (hyp[tnum] - meanh)
            tdiff = traces[tnum,:] - meant

            sumnum = sumnum + (hdiff*tdiff)
            sumden1 = sumden1 + hdiff*hdiff 
            sumden2 = sumden2 + tdiff*tdiff

        cpaoutput[kguess] = sumnum / np.sqrt( sumden1 * sumden2 )
        maxcpa[kguess] = max(abs(cpaoutput[kguess]))

        print (maxcpa[kguess])

    #Find maximum value of key
    bestguess[bnum] = np.argmax(maxcpa)

print ("Best Key Guess: ")
for b in bestguess: 
    print ("%02x "%b)

end_time = time.time()
elapsed_time = end_time - start_time

print(f"Elapsed Time: {elapsed_time} seconds")

Subkey  0, hyp = 00: 
0.14277403706870606
Subkey  0, hyp = 01: 
0.16200707687937874
Subkey  0, hyp = 02: 
0.15895337015953015
Subkey  0, hyp = 03: 
0.14601586429513125
Subkey  0, hyp = 04: 
0.15054969520617814
Subkey  0, hyp = 05: 
0.14257308816101677
Subkey  0, hyp = 06: 
0.14138446864439325
Subkey  0, hyp = 07: 
0.16066741028069362
Subkey  0, hyp = 08: 
0.14787487227971416
Subkey  0, hyp = 09: 
0.1495489515503309
Subkey  0, hyp = 0a: 
0.1590478501576447
Subkey  0, hyp = 0b: 
0.13210482265704104
Subkey  0, hyp = 0c: 
0.15426787309973808
Subkey  0, hyp = 0d: 
0.14313940055098587
Subkey  0, hyp = 0e: 
0.1539923416532439
Subkey  0, hyp = 0f: 
0.14929170807532396
Subkey  0, hyp = 10: 
0.1530708589029247
Subkey  0, hyp = 11: 
0.15818285615685745
Subkey  0, hyp = 12: 
0.14972011080741324
Subkey  0, hyp = 13: 
0.16239194429299172
Subkey  0, hyp = 14: 
0.14751173274895465
Subkey  0, hyp = 15: 
0.13769266037295094
Subkey  0, hyp = 16: 
0.14686485574541683
Subkey  0, hyp = 17: 
0.15125630872320

0.16609301977906746
Subkey  0, hyp = c5: 
0.1572715121393675
Subkey  0, hyp = c6: 
0.14603872916217953
Subkey  0, hyp = c7: 
0.14705554889436015
Subkey  0, hyp = c8: 
0.14934154371788408
Subkey  0, hyp = c9: 
0.15019745032763515
Subkey  0, hyp = ca: 
0.15447558396200062
Subkey  0, hyp = cb: 
0.1549125455648357
Subkey  0, hyp = cc: 
0.14460819718037823
Subkey  0, hyp = cd: 
0.15087764156130035
Subkey  0, hyp = ce: 
0.15527645825188544
Subkey  0, hyp = cf: 
0.1659279417548766
Subkey  0, hyp = d0: 
0.1498589460537063
Subkey  0, hyp = d1: 
0.14678799416388616
Subkey  0, hyp = d2: 
0.14778938083572965
Subkey  0, hyp = d3: 
0.14828445825288647
Subkey  0, hyp = d4: 
0.15122416978655898
Subkey  0, hyp = d5: 
0.14524914688299728
Subkey  0, hyp = d6: 
0.14161593660219687
Subkey  0, hyp = d7: 
0.14125251595334057
Subkey  0, hyp = d8: 
0.15870693804537295
Subkey  0, hyp = d9: 
0.14355696919579364
Subkey  0, hyp = da: 
0.15520332706696746
Subkey  0, hyp = db: 
0.15044867705341722
Subkey  0, hyp = d

0.16087592876826037
Subkey  1, hyp = 8a: 
0.14610412567660977
Subkey  1, hyp = 8b: 
0.1443829579461856
Subkey  1, hyp = 8c: 
0.14884082008234684
Subkey  1, hyp = 8d: 
0.1474692972289952
Subkey  1, hyp = 8e: 
0.14909228798606874
Subkey  1, hyp = 8f: 
0.15414081073401603
Subkey  1, hyp = 90: 
0.14775751060247685
Subkey  1, hyp = 91: 
0.15019000463317286
Subkey  1, hyp = 92: 
0.1425500312093572
Subkey  1, hyp = 93: 
0.16590336883578652
Subkey  1, hyp = 94: 
0.14515948977639354
Subkey  1, hyp = 95: 
0.15479276556625715
Subkey  1, hyp = 96: 
0.17046015200237677
Subkey  1, hyp = 97: 
0.15391396632655047
Subkey  1, hyp = 98: 
0.14978338768573224
Subkey  1, hyp = 99: 
0.16151521819704326
Subkey  1, hyp = 9a: 
0.15309996069464615
Subkey  1, hyp = 9b: 
0.16016141911220808
Subkey  1, hyp = 9c: 
0.15311750912034602
Subkey  1, hyp = 9d: 
0.1575126483065015
Subkey  1, hyp = 9e: 
0.14431710791133123
Subkey  1, hyp = 9f: 
0.1621167800575135
Subkey  1, hyp = a0: 
0.15884583597322918
Subkey  1, hyp = a1

0.1493570517882596
Subkey  2, hyp = 4f: 
0.14072548436491158
Subkey  2, hyp = 50: 
0.17337547766462655
Subkey  2, hyp = 51: 
0.15615200367646553
Subkey  2, hyp = 52: 
0.14502330341220926
Subkey  2, hyp = 53: 
0.1484302750886027
Subkey  2, hyp = 54: 
0.14729049262862565
Subkey  2, hyp = 55: 
0.14302570931977174
Subkey  2, hyp = 56: 
0.16086457656073783
Subkey  2, hyp = 57: 
0.14295541563728792
Subkey  2, hyp = 58: 
0.1455366182503501
Subkey  2, hyp = 59: 
0.15494589821041377
Subkey  2, hyp = 5a: 
0.1471974804659412
Subkey  2, hyp = 5b: 
0.14971055130625008
Subkey  2, hyp = 5c: 
0.150954735218364
Subkey  2, hyp = 5d: 
0.15163972430612604
Subkey  2, hyp = 5e: 
0.15084336138198495
Subkey  2, hyp = 5f: 
0.14479905831809822
Subkey  2, hyp = 60: 
0.15305571959436517
Subkey  2, hyp = 61: 
0.16387801652545295
Subkey  2, hyp = 62: 
0.14896325320074857
Subkey  2, hyp = 63: 
0.15936443977903528
Subkey  2, hyp = 64: 
0.16017923982416518
Subkey  2, hyp = 65: 
0.14202077498428775
Subkey  2, hyp = 66:

0.1548833134727208
Subkey  3, hyp = 14: 
0.15490055715122455
Subkey  3, hyp = 15: 
0.14640677596422727
Subkey  3, hyp = 16: 
0.14089651560037372
Subkey  3, hyp = 17: 
0.1503644402077097
Subkey  3, hyp = 18: 
0.14736939488509015
Subkey  3, hyp = 19: 
0.15107873603046854
Subkey  3, hyp = 1a: 
0.1680918711503066
Subkey  3, hyp = 1b: 
0.1505552822697434
Subkey  3, hyp = 1c: 
0.1503753826391662
Subkey  3, hyp = 1d: 
0.15195947363421594
Subkey  3, hyp = 1e: 
0.15653536083220446
Subkey  3, hyp = 1f: 
0.13947436732452131
Subkey  3, hyp = 20: 
0.13727938850021273
Subkey  3, hyp = 21: 
0.1725768940288738
Subkey  3, hyp = 22: 
0.15971510281306983
Subkey  3, hyp = 23: 
0.1516829614053147
Subkey  3, hyp = 24: 
0.1414516919656029
Subkey  3, hyp = 25: 
0.15975404862987347
Subkey  3, hyp = 26: 
0.14922211593739657
Subkey  3, hyp = 27: 
0.1508723276798864
Subkey  3, hyp = 28: 
0.14222066638123732
Subkey  3, hyp = 29: 
0.15215911098028406
Subkey  3, hyp = 2a: 
0.1657983929773024
Subkey  3, hyp = 2b: 
0.

0.16701896907403085
Subkey  3, hyp = d9: 
0.15057379500238216
Subkey  3, hyp = da: 
0.14717691312319744
Subkey  3, hyp = db: 
0.1554984630194788
Subkey  3, hyp = dc: 
0.14798753791084934
Subkey  3, hyp = dd: 
0.14778994492465777
Subkey  3, hyp = de: 
0.15414425360192727
Subkey  3, hyp = df: 
0.15810020691865403
Subkey  3, hyp = e0: 
0.1465289148226388
Subkey  3, hyp = e1: 
0.15531652565966836
Subkey  3, hyp = e2: 
0.1523770200300972
Subkey  3, hyp = e3: 
0.14356078731828167
Subkey  3, hyp = e4: 
0.15074215685150735
Subkey  3, hyp = e5: 
0.1454550182446906
Subkey  3, hyp = e6: 
0.1583527099355933
Subkey  3, hyp = e7: 
0.1530425119999748
Subkey  3, hyp = e8: 
0.16033995317022487
Subkey  3, hyp = e9: 
0.15331162118184383
Subkey  3, hyp = ea: 
0.149775351274827
Subkey  3, hyp = eb: 
0.1491946340212457
Subkey  3, hyp = ec: 
0.1426336926792609
Subkey  3, hyp = ed: 
0.14287787088547352
Subkey  3, hyp = ee: 
0.14287665821237513
Subkey  3, hyp = ef: 
0.13946631964705467
Subkey  3, hyp = f0: 
0.