# Arduino Program Classification

In this Jupyter notebook, we apply what we have learned so far to build an ML classifier that can distinguish between 5 different programs running on an Arduino device. We are provided with 5 EM trace files, captured with a HackRF SDR device at a sampling rate of 20 MHz. Because ZMQ sockets were used during the data capture, the files are stored in NumPy (`.npy`) format.

#### Import the libraries

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from emvincelib import iq, ml, stat
from sklearn import svm
from sklearn.neural_network import MLPClassifier
from scipy.fftpack import fft
from sklearn import preprocessing

%matplotlib inline

### 1. Visualizing the 5 EM Data Files

As a first step, let's plot the data in each file to get an idea of what it looks like. This time, we plot the power spectral density (PSD).
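
For readers who want to see what a PSD plot is built from, here is a minimal sketch of a single-FFT (periodogram) estimate written with plain NumPy. It is not how `iq.plotPSD` is implemented; the helper name `periodogram_psd` and the synthetic 2 MHz test tone are assumptions made purely for illustration.

```python
import numpy as np

def periodogram_psd(samples, sample_rate):
    """Estimate the PSD of complex IQ samples with one Hann-windowed FFT.

    Hypothetical helper for illustration only; not part of emvincelib.
    """
    samples = np.asarray(samples, dtype=np.complex128)
    n = len(samples)
    window = np.hanning(n)
    # fftshift moves the zero-frequency bin to the centre of the spectrum
    spectrum = np.fft.fftshift(np.fft.fft(samples * window))
    # Normalise by the sample rate and the window energy
    psd = (np.abs(spectrum) ** 2) / (sample_rate * np.sum(window ** 2))
    freqs = np.fft.fftshift(np.fft.fftfreq(n, d=1.0 / sample_rate))
    # Small offset avoids log10(0) on empty bins
    psd_db = 10.0 * np.log10(psd + 1e-20)
    return freqs, psd_db

# Synthetic stand-in for a capture: a 2 MHz complex tone plus noise at 20 MHz sampling
sample_rate = 20e6
n = 4096
t = np.arange(n) / sample_rate
rng = np.random.default_rng(0)
tone = np.exp(2j * np.pi * 2e6 * t)
noise = 0.1 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))

freqs, psd_db = periodogram_psd(tone + noise, sample_rate)
peak_freq = freqs[np.argmax(psd_db)]
print("Strongest component near %.0f Hz" % peak_freq)
```

Because IQ data is complex, the frequency axis runs from minus half the sample rate to plus half the sample rate, centred on the SDR's tuning frequency.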

#### Visualization of file 1

In [None]:
iq.sampleRate = 20e6

file1 = "./data/arduino-program-classification/activity-1.npy"

duration1 = iq.getTimeDuration(file1, fileType="npy")                      
print("Time duration of the npy file: " + str(duration1) + " seconds")

data1 = iq.getSegmentData(file1, 0, duration1, fileType='npy')

length = len(data1)
print("Number of samples in numpy data: " + str(length))

iq.plotPSD(data1)

#### Visualization of file 2

In [None]:
file2 = "./data/arduino-program-classification/activity-2.npy"

duration2 = iq.getTimeDuration(file2, fileType="npy")                      
print("Time duration of the npy file: " + str(duration2) + " seconds")

data2 = iq.getSegmentData(file2, 0, duration2, fileType='npy')

length = len(data2)
print("Number of samples in numpy data: " + str(length))

iq.plotPSD(data2)

#### Visualization of file 3

In [None]:
file3 = "./data/arduino-program-classification/activity-3.npy"

duration3 = iq.getTimeDuration(file3, fileType="npy")                      
print("Time duration of the npy file: " + str(duration3) + " seconds")

data3 = iq.getSegmentData(file3, 0, duration3, fileType='npy')

length = len(data3)
print("Number of samples in numpy data: " + str(length))

iq.plotPSD(data3)

#### Visualization of file 4

In [None]:
file4 = "./data/arduino-program-classification/activity-4.npy"

duration4 = iq.getTimeDuration(file4, fileType="npy")                      
print("Time duration of the npy file: " + str(duration4) + " seconds")

data4 = iq.getSegmentData(file4, 0, duration4, fileType='npy')

length = len(data4)
print("Number of samples in numpy data: " + str(length))

iq.plotPSD(data4)

#### Visualization of file 5

In [None]:
file5 = "./data/arduino-program-classification/activity-5.npy"

duration5 = iq.getTimeDuration(file5, fileType="npy")                      
print("Time duration of the npy file: " + str(duration5) + " seconds")

data5 = iq.getSegmentData(file5, 0, duration5, fileType='npy')

length = len(data5)
print("Number of samples in numpy data: " + str(length))

iq.plotPSD(data5)

### 2. Training and Testing Machine Learning Model

In [None]:
# iq.sampleRate was already set to 20e6 in the first visualization cell
sliding_window = 0.01
feature_vector_size = 1000

# Load each file with its own duration and class label
ml.loadTrainingData(file1, iq.sampleRate, feature_vector_size, sliding_window, duration1, "Class 1")
ml.loadTrainingData(file2, iq.sampleRate, feature_vector_size, sliding_window, duration2, "Class 2")
ml.loadTrainingData(file3, iq.sampleRate, feature_vector_size, sliding_window, duration3, "Class 3")
ml.loadTrainingData(file4, iq.sampleRate, feature_vector_size, sliding_window, duration4, "Class 4")
ml.loadTrainingData(file5, iq.sampleRate, feature_vector_size, sliding_window, duration5, "Class 5")

clf = ml.createClassifier()
ml.trainAndTest(clf)
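
The cell above hides the feature extraction and training steps inside `emvincelib`. As a rough, self-contained sketch of the general idea (windowing a trace, turning each window into a fixed-size FFT feature vector, then training and testing a classifier), the example below uses NumPy and scikit-learn on synthetic data. Everything in it is an assumption for illustration: the helper names `window_features` and `fake_trace`, the reading of `sliding_window` as a non-overlapping window length in seconds, the bin-averaging step, the linear SVM, and the scaled-down sample rate and sizes. It does not reproduce what `ml.loadTrainingData`, `ml.createClassifier`, or `ml.trainAndTest` do internally.

```python
import numpy as np
from sklearn import preprocessing, svm
from sklearn.model_selection import train_test_split

def window_features(samples, sample_rate, window_seconds, feature_vector_size):
    """Cut a trace into non-overlapping windows; return one FFT-magnitude vector per window.

    Hypothetical helper for illustration only; not part of emvincelib.
    """
    window_len = int(round(window_seconds * sample_rate))
    n_windows = len(samples) // window_len
    features = []
    for i in range(n_windows):
        segment = samples[i * window_len:(i + 1) * window_len]
        magnitude = np.abs(np.fft.fft(segment))
        # Average neighbouring bins so every window yields feature_vector_size values
        chunks = np.array_split(magnitude, feature_vector_size)
        features.append([chunk.mean() for chunk in chunks])
    return np.array(features)

# Scaled-down, assumed values so the sketch runs quickly (the notebook uses 20e6, 0.01, 1000)
sample_rate = 1e6
window_seconds = 0.001
feature_vector_size = 100
rng = np.random.default_rng(0)

def fake_trace(tone_hz, seconds=0.05):
    """Synthetic stand-in for an EM capture: one complex tone plus noise."""
    t = np.arange(int(round(seconds * sample_rate))) / sample_rate
    noise = 0.5 * (rng.standard_normal(t.size) + 1j * rng.standard_normal(t.size))
    return np.exp(2j * np.pi * tone_hz * t) + noise

# Build a labelled data set, one synthetic "program" per class
X_parts, y_parts = [], []
for label, tone_hz in [("Class 1", 100e3), ("Class 2", 300e3)]:
    feats = window_features(fake_trace(tone_hz), sample_rate, window_seconds, feature_vector_size)
    X_parts.append(feats)
    y_parts.extend([label] * len(feats))
X = np.vstack(X_parts)
y = np.array(y_parts)

# Hold out 30% of the windows for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

# Fit the scaler on the training windows only, then apply it to both splits
scaler = preprocessing.StandardScaler().fit(X_train)
clf_sketch = svm.SVC(kernel="linear").fit(scaler.transform(X_train), y_train)
accuracy = clf_sketch.score(scaler.transform(X_test), y_test)
print("Test accuracy on synthetic data:", accuracy)
```

The synthetic classes differ only in tone frequency, so they are easy to separate; real EM traces from different Arduino programs will generally be harder, and the accuracy here says nothing about the results of the notebook's own classifier.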