## A Neural Network-based Classifier with a Sliding Window Sampler for EM Traces

This notebook introduces a classifier that identifies the software activity of IoT devices from their EM emissions. The neural network is trained on labeled Fourier transforms of EM traces, while the testing phase uses a sliding window to sample new data from live EM data streams.

### EM Trace Acquisition (temporarily not working)

We use a client shell script on the target device as a UDP client and the following Python script as the UDP server on the host computer. The client shell script notifies the server of the beginning and end of a specific software activity, and the UDP server saves a small EM trace file for each activity region. The UDP server relies on a GNURadio Companion (GRC) script file named **arduino-data-acquisition.grc**, importing the **top_block.py** file that is initially generated from the GRC file.
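The begin/end notification scheme described above can be sketched as follows. This is only an illustration under stated assumptions: the message format (`START <label>` / `END <label>`), the port number, and the names `handle_notification` and `serve` are hypothetical and do not come from the original scripts.

```python
import socket


def handle_notification(payload, active):
    """Update the set of active activity labels from one UDP datagram.

    Assumed (hypothetical) message format: b"START <label>" or b"END <label>".
    Returns ("start", label), ("end", label), or None for anything else.
    """
    parts = payload.decode("ascii", errors="replace").split()
    if len(parts) != 2:
        return None
    command, label = parts[0].upper(), parts[1]
    if command == "START":
        active.add(label)
        return ("start", label)
    if command == "END" and label in active:
        active.discard(label)
        return ("end", label)
    return None


def serve(host="0.0.0.0", port=5005, max_datagrams=None):
    """Listen for notifications; the port is an arbitrary example value."""
    active = set()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind((host, port))
        received = 0
        while max_datagrams is None or received < max_datagrams:
            payload, _addr = sock.recvfrom(1024)
            received += 1
            event = handle_notification(payload, active)
            if event is not None:
                # A real server would start or stop saving the EM trace here.
                print(event)
```

Keeping the message parsing in a pure function separate from the socket loop makes the protocol logic easy to check without any radio hardware attached.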

After a sufficient number of EM traces have been saved, they should be manually copied into the correct class directory under **./data/em-traces/**, which contains one directory per class.

### Doesn't work:
GNURadio 3.8 is needed to generate a Python 3-compatible **top_block.py**. I currently have GNURadio 3.7, so the generated code runs only on Python 2.

### Solution:
The same code is saved as a separate file named **em-trace-capture.py**. Run it from a terminal with Python 2 as follows.

**python2 em-trace-capture.py**


In [None]:
import top_block
import time

print("Creating top_block class...")
tb=top_block.top_block()

print("Starting top_block...")
tb.start()
print("Started...")

while True:
    print("starting...")
    tb.set_trigger(1)
    time.sleep(0.025)
    tb.set_trigger(-1)     
    print("")

### EM Trace Pre-Processing

In this stage, we take the EM traces of every class and extract a smaller segment from each one. The segment is Fourier transformed, averaged down to a 500-element vector, and normalized. Finally, this feature vector is saved as a NumPy array (.npy) with the file name **\[class_label\].\[sequence_number\].npy** so that it can be used to train a classifier later. The following code should be run once per class to create that class's training samples, uncommenting the corresponding label in the code each time.
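To make the Fourier transform, averaging, and normalization steps concrete, here is a minimal sketch of how a segment could be turned into a 500-element feature vector. It is not the implementation of `iq.getFeatureVector`, whose internals are not shown in this notebook. The function name `sketch_feature_vector`, the use of the FFT magnitude, the equal-sized averaging bins, and the min-max normalization are all assumptions.

```python
import numpy as np


def sketch_feature_vector(segment, n_bins=500):
    """Illustrative only: FFT magnitude -> average into n_bins -> scale to [0, 1]."""
    segment = np.asarray(segment)
    if segment.size < n_bins:
        raise ValueError("segment must contain at least n_bins samples")
    # Magnitude spectrum, shifted so that frequencies run from negative to positive.
    spectrum = np.abs(np.fft.fftshift(np.fft.fft(segment)))
    # Average neighbouring frequency bins down to n_bins values.
    chunks = np.array_split(spectrum, n_bins)
    averaged = np.array([chunk.mean() for chunk in chunks])
    # Min-max normalization (an assumed choice of normalization).
    span = averaged.max() - averaged.min()
    if span == 0:
        return np.zeros(n_bins)
    return (averaged - averaged.min()) / span
```

Whatever the exact normalization, the key property is that every trace maps to a vector of the same length, so the samples can be stacked into a single training matrix.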

In [1]:
import seciqlib as iq
import numpy as np
import matplotlib.pyplot as plt
import os

# Uncomment exactly one label to generate the training data of that class
# (with none uncommented, the cell stops with a NameError on 'label')
#label='loop1' 
#label='loop2'
#label='loop3'
#label='loop4'
#label='loop5'
#label='loop6'
#label='loop7'
#label='loop8'
#label='loop9'
#label='loop10'

# the position in the EM traces from which we take a little segment
offset_time = 0
window_time = 0.01

# the directory from where the EM traces are taken
path_to_em_traces = "./data/em-traces/"+label
# the directory to store the training samples
path_to_training_samples = "./data/training-samples"
    
listOfFiles = os.listdir(path_to_em_traces)
traceCounter = 0

for file_name in listOfFiles:
    segment = iq.getSegmentData(path_to_em_traces+"/"+file_name, offset_time, window_time)
    feature_vector = iq.getFeatureVector(segment)
    #plt.figure()
    #plt.plot(feature_vector)
    #plt.show()    
    np.save(path_to_training_samples+"/"+label+"."+str(traceCounter), feature_vector)  
    traceCounter = traceCounter + 1
    
print("Done!")

NameError: name 'label' is not defined

### Training and Testing the Classifier

All the training samples are read from their files, loaded into a 2-D array, and fed into the neural network-based classifier. Either a simple train-and-test operation or a 10-fold cross-validation can be performed.
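As an illustration of the loading step, the sketch below reads every `.npy` file in a directory and recovers the class label from the **\[class_label\].\[sequence_number\].npy** naming scheme. It is not the code of `nn.loadDataToXY`, which is not shown here; the function name `load_samples` and the sorted file order are assumptions.

```python
import os

import numpy as np


def load_samples(directory):
    """Illustrative loader: returns (X, Y) with one row of X per .npy file."""
    features, labels = [], []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".npy"):
            continue
        # "loop3.17.npy" -> class label "loop3"
        labels.append(name.split(".")[0])
        features.append(np.load(os.path.join(directory, name)))
    if not features:
        raise ValueError("no .npy training samples found in " + directory)
    return np.vstack(features), np.array(labels)
```

With 500-element feature vectors, `X` has shape `(number_of_samples, 500)` and `Y` holds the matching class labels.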

In [1]:
import nnclassifier as nn

# the directory to store the training samples
path_to_training_samples = "./data/training-samples"

# loading the training samples to memory
X, Y = nn.loadDataToXY(path_to_training_samples)

# creating classifier
clf = nn.createClassifier()

# 10-fold cross-validation
#nn.tenFoldCrossValidation(clf, X, Y)

# Training and testing
nn.trainAndTest(clf, X, Y)

[[156   0   0   0   0   0   1   0   0   0]
 [  0 144   0   0   1   0   0   0   0   0]
 [  0   0 120   0   0   1  11   0   0   4]
 [  0   0   0 153   1   0   1   0   0   0]
 [  0   0   0   0 135   0   1   0   0   0]
 [  0   0   0   2   1 149   0   0   0   0]
 [  0   0   5   0   0   0 108   3   0  37]
 [  0   0   0   0   0   0   1 159   0   0]
 [  0   0   0   0   0   0   0   0 167   0]
 [  0   0   1   0   0   0  32   1   0 105]]
             precision    recall  f1-score   support

      loop1       1.00      0.99      1.00       157
     loop10       1.00      0.99      1.00       145
      loop2       0.95      0.88      0.92       136
      loop3       0.99      0.99      0.99       155
      loop4       0.98      0.99      0.99       136
      loop5       0.99      0.98      0.99       152
      loop6       0.70      0.71      0.70       153
      loop7       0.98      0.99      0.98       160
      loop8       1.00      1.00      1.00       167
      loop9       0.72      0.76      
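
The per-class precision and recall in the report can be recomputed from the confusion matrix, which helps when reading the weaker classes. The sketch below copies the matrix printed above and assumes that rows are true classes and columns are predicted classes, in the same label order as the report; the row sums match the support column, which is consistent with that reading.

```python
import numpy as np

labels = ["loop1", "loop10", "loop2", "loop3", "loop4",
          "loop5", "loop6", "loop7", "loop8", "loop9"]

confusion = np.array([
    [156,   0,   0,   0,   0,   0,   1,   0,   0,   0],
    [  0, 144,   0,   0,   1,   0,   0,   0,   0,   0],
    [  0,   0, 120,   0,   0,   1,  11,   0,   0,   4],
    [  0,   0,   0, 153,   1,   0,   1,   0,   0,   0],
    [  0,   0,   0,   0, 135,   0,   1,   0,   0,   0],
    [  0,   0,   0,   2,   1, 149,   0,   0,   0,   0],
    [  0,   0,   5,   0,   0,   0, 108,   3,   0,  37],
    [  0,   0,   0,   0,   0,   0,   1, 159,   0,   0],
    [  0,   0,   0,   0,   0,   0,   0,   0, 167,   0],
    [  0,   0,   1,   0,   0,   0,  32,   1,   0, 105],
])

true_positives = np.diag(confusion)
recall = true_positives / confusion.sum(axis=1)     # divide by row sums (support)
precision = true_positives / confusion.sum(axis=0)  # divide by column sums

for label, p, r in zip(labels, precision, recall):
    print(f"{label:>7}  precision={p:.2f}  recall={r:.2f}")
```

Most of the errors sit in the loop6 and loop9 rows and columns: 37 loop6 samples are predicted as loop9 and 32 loop9 samples are predicted as loop6, which accounts for the lower scores of those two classes.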

### Saving the trained model into a file

In [2]:
from joblib import dump, load

# Saving the trained model to a file
dump(clf, "10-class-neural-network-model-joblib.model")

['10-class-neural-network-model-joblib.model']

### Testing Classifier with Sliding Window Data - Stopped here

The trained classifier is then used to test a new, longer EM trace that may contain information related to any unknown software activity. A sliding window extracts segments of data from the EM trace, and each segment is fed to the classifier to detect the software activity.
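The window positions themselves can be sketched independently of the trace-reading library. The generator below is a hypothetical helper (the name `sliding_window_offsets` is not part of `seciqlib`). It computes each offset as `index * step_time` rather than by repeated addition, which avoids accumulating floating-point error over a long trace.

```python
def sliding_window_offsets(duration, window_time, step_time):
    """Yield start offsets (seconds) of windows that end strictly before `duration`."""
    if window_time <= 0 or step_time <= 0:
        raise ValueError("window_time and step_time must be positive")
    index = 0
    while True:
        offset = index * step_time
        if offset + window_time >= duration:
            return
        yield offset
        index += 1


# Example: a 1 s trace, 0.25 s windows, 0.25 s steps.
for offset in sliding_window_offsets(1.0, 0.25, 0.25):
    print(offset)
```

A step smaller than the window, such as a 0.001 s step with a 0.01 s window, makes consecutive windows overlap, so each part of the trace is classified several times.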

In [None]:
import seciqlib as iq
import nnclassifier as nn  # needed for nn.predictClass below
import numpy as np
import matplotlib.pyplot as plt
import os

# the initial position in the EM trace segment
offset_time = 0
window_time = 0.01

'''
label='3des'
# the directory from where the EM traces are taken
path_to_em_traces = "./data/em-traces/"+label
# testing EM trace name
file_name='file16_136_244.44975915.dat'
'''

label='loop1'
# the directory from where the EM traces are taken
path_to_em_traces = "./data/sliding-window-traces/"+label
# testing EM trace name
file_name='file16_102_134.48857225.dat'

duration = iq.getTimeDuration(path_to_em_traces+'/'+file_name)

while (offset_time + window_time) < duration:
    print(offset_time)
    segment = iq.getSegmentData(path_to_em_traces+"/"+file_name, offset_time, window_time)
    feature_vector = iq.getFeatureVector(segment)
    y_pred = nn.predictClass(clf, [feature_vector.tolist()])
    print(y_pred)
    offset_time = offset_time + 0.001
    