<h1>Week 8: Deploying Machine-Learning Solutions in IoT</h1>

<hr>
In Week 7, we covered the fundamentals of Machine Learning, exploring key concepts and building simple models using the Titanic Survival dataset. We also learned how to make predictions using these models, though our work was limited to a Jupyter Notebook environment. This week, we will take it a step further by deploying a machine learning model on a Raspberry Pi. We will use real-time data collected from the ADXL343 sensor to make live predictions, bringing our models from theory to practical application in IoT.

This week, you'll work with a new dataset that contains 3-axis acceleration data for Human Activity Recognition. The data was collected at a sampling rate of 20 Hz, and each sample (one row) represents a 3-second window. The dataset includes two activity labels:

- Walking: This category encompasses various walking activities, including walking up or down stairs, looking at a phone, and engaging in social interactions.
- Sitting: This label covers periods when the person was seated, performing general office tasks.

You can download the CSV file from this week's module. Once downloaded, you can begin examining the data using Pandas.
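
As a quick sanity check on the figures above, a 20 Hz sampling rate over a 3-second window should give 60 readings per axis in each row. The sketch below only illustrates that arithmetic; the variable names are hypothetical and not taken from the dataset.

```python
# Hypothetical names; the values come from the dataset description above
sampling_rate_hz = 20   # readings per second
window_seconds = 3      # length of one sample window

# Expected number of readings per axis in one row of the dataset
readings_per_window = sampling_rate_hz * window_seconds
print(readings_per_window)
```

If the dataset follows this description, the length of each parsed list of acceleration values should match this number.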
<hr>

In [14]:
import pandas as pd

df = pd.read_csv("week8_data.csv")
df.head()

Unnamed: 0,x_values,y_values,z_values,label
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",walking
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",walking
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",walking
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",walking
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",walking


In [15]:
df.shape

(2000, 4)

In [16]:
df['label'].value_counts()

label
walking    1000
sitting    1000
Name: count, dtype: int64

<hr>

We can verify that the dataset contains 2,000 samples, with 1,000 labeled as walking and another 1,000 as sitting.

Currently, the values are stored as strings in the DataFrame columns. Before processing, we'll need to convert these string values back into Python lists.

<hr>

In [17]:
import ast

# convert string back into lists
df['x_values'] = df['x_values'].apply(ast.literal_eval)
df['y_values'] = df['y_values'].apply(ast.literal_eval)
df['z_values'] = df['z_values'].apply(ast.literal_eval)

In [18]:
df

Unnamed: 0,x_values,y_values,z_values,label
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",walking
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",walking
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",walking
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",walking
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",walking
...,...,...,...,...
1995,"[6.2370294, 6.1978028, 6.1585762, 6.1978028, 6...","[6.7469752000000005, 6.864655, 6.7862018000000...","[3.4911673999999997, 3.5696205999999995, 3.569...",sitting
1996,"[6.2370294, 6.276256, 6.1585762, 6.3154826, 6....","[6.668522, 6.8254284, 7.1000146, 6.5116156, 6....","[3.6088471999999996, 3.6480737999999997, 3.530...",sitting
1997,"[6.1978028, 6.1585762, 6.1585762, 6.2370294, 6...","[6.7077486, 6.864655, 6.786201800000001, 6.903...","[3.6088471999999996, 3.6480737999999997, 3.648...",sitting
1998,"[6.1978028, 6.2370294, 6.1978028, 6.1585762, 6...","[6.864655, 6.786201800000001, 6.78620180000000...","[3.6088471999999996, 3.5696205999999995, 3.530...",sitting


<hr>

As we covered in Week 5, we can proceed with some simple feature extraction on the dataset.
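
To make the three feature types concrete, here is a minimal sketch on a made-up window (the values are invented for illustration and are not from the dataset). It computes the mean, the range, and the Pearson correlation between two axes.

```python
import numpy as np

# Toy window, not from the dataset: y is exactly 2 * x
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]

x_mean = sum(x) / len(x)             # average value over the window
x_range = max(x) - min(x)            # spread between largest and smallest reading
xy_corr = np.corrcoef(x, y)[0, 1]    # Pearson correlation between the two axes

print(x_mean, x_range, round(xy_corr, 6))
```

Note that `np.corrcoef` returns `nan` when one of its inputs is constant, which could happen if an axis reads exactly the same value for a whole window.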

<hr>

In [19]:
# Define helper functions
import numpy as np

def calculate_mean(lst):
    return sum(lst) / len(lst)

def calculate_range(lst):
    return max(lst) - min(lst)

def calculate_correlation(list1, list2):
    return np.corrcoef(list1, list2)[0, 1]

In [20]:
# Calculate mean acceleration in 3 axes

df['x_mean'] = df['x_values'].apply(calculate_mean)
df['y_mean'] = df['y_values'].apply(calculate_mean)
df['z_mean'] = df['z_values'].apply(calculate_mean)

In [21]:
df

Unnamed: 0,x_values,y_values,z_values,label,x_mean,y_mean,z_mean
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",walking,-1.196411,1.313437,8.807025
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",walking,-1.148686,1.106844,8.848213
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",walking,-1.143455,1.148686,8.851482
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",walking,-1.117304,1.165684,8.823370
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",walking,0.271971,-0.209209,9.054153
...,...,...,...,...,...,...,...
1995,"[6.2370294, 6.1978028, 6.1585762, 6.1978028, 6...","[6.7469752000000005, 6.864655, 6.7862018000000...","[3.4911673999999997, 3.5696205999999995, 3.569...",sitting,6.158576,6.832620,3.602309
1996,"[6.2370294, 6.276256, 6.1585762, 6.3154826, 6....","[6.668522, 6.8254284, 7.1000146, 6.5116156, 6....","[3.6088471999999996, 3.6480737999999997, 3.530...",sitting,6.191265,6.794047,3.607540
1997,"[6.1978028, 6.1585762, 6.1585762, 6.2370294, 6...","[6.7077486, 6.864655, 6.786201800000001, 6.903...","[3.6088471999999996, 3.6480737999999997, 3.648...",sitting,6.179497,6.812353,3.601002
1998,"[6.1978028, 6.2370294, 6.1978028, 6.1585762, 6...","[6.864655, 6.786201800000001, 6.78620180000000...","[3.6088471999999996, 3.5696205999999995, 3.530...",sitting,6.101698,6.884922,3.572889


In [22]:
# Calculate acceleration range in 3 axes

df['x_range'] = df['x_values'].apply(calculate_range)
df['y_range'] = df['y_values'].apply(calculate_range)
df['z_range'] = df['z_values'].apply(calculate_range)

In [23]:
df

Unnamed: 0,x_values,y_values,z_values,label,x_mean,y_mean,z_mean,x_range,y_range,z_range
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",walking,-1.196411,1.313437,8.807025,0.274586,1.216025,0.353039
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",walking,-1.148686,1.106844,8.848213,0.274586,1.059118,0.235360
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",walking,-1.143455,1.148686,8.851482,0.431493,0.784532,0.431493
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",walking,-1.117304,1.165684,8.823370,0.235360,0.902212,0.313813
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",walking,0.271971,-0.209209,9.054153,0.156906,0.235360,0.235360
...,...,...,...,...,...,...,...,...,...,...
1995,"[6.2370294, 6.1978028, 6.1585762, 6.1978028, 6...","[6.7469752000000005, 6.864655, 6.7862018000000...","[3.4911673999999997, 3.5696205999999995, 3.569...",sitting,6.158576,6.832620,3.602309,0.627626,1.098345,1.019892
1996,"[6.2370294, 6.276256, 6.1585762, 6.3154826, 6....","[6.668522, 6.8254284, 7.1000146, 6.5116156, 6....","[3.6088471999999996, 3.6480737999999997, 3.530...",sitting,6.191265,6.794047,3.607540,0.313813,0.588399,0.353039
1997,"[6.1978028, 6.1585762, 6.1585762, 6.2370294, 6...","[6.7077486, 6.864655, 6.786201800000001, 6.903...","[3.6088471999999996, 3.6480737999999997, 3.648...",sitting,6.179497,6.812353,3.601002,0.235360,0.313813,0.431493
1998,"[6.1978028, 6.2370294, 6.1978028, 6.1585762, 6...","[6.864655, 6.786201800000001, 6.78620180000000...","[3.6088471999999996, 3.5696205999999995, 3.530...",sitting,6.101698,6.884922,3.572889,0.470719,0.980665,0.392266


In [24]:
# Calculate the three correlations (xy, yz and xz)

df['xy_corr'] = df.apply(lambda row: calculate_correlation(row['x_values'], row['y_values']), axis=1)
df['yz_corr'] = df.apply(lambda row: calculate_correlation(row['y_values'], row['z_values']), axis=1)
df['xz_corr'] = df.apply(lambda row: calculate_correlation(row['x_values'], row['z_values']), axis=1)

In [25]:
df

Unnamed: 0,x_values,y_values,z_values,label,x_mean,y_mean,z_mean,x_range,y_range,z_range,xy_corr,yz_corr,xz_corr
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",walking,-1.196411,1.313437,8.807025,0.274586,1.216025,0.353039,-0.098106,-0.526352,-0.182488
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",walking,-1.148686,1.106844,8.848213,0.274586,1.059118,0.235360,-0.240468,-0.340499,-0.322237
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",walking,-1.143455,1.148686,8.851482,0.431493,0.784532,0.431493,0.187816,-0.364041,-0.513297
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",walking,-1.117304,1.165684,8.823370,0.235360,0.902212,0.313813,0.091160,-0.215989,-0.238869
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",walking,0.271971,-0.209209,9.054153,0.156906,0.235360,0.235360,0.109223,-0.355972,-0.468424
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,"[6.2370294, 6.1978028, 6.1585762, 6.1978028, 6...","[6.7469752000000005, 6.864655, 6.7862018000000...","[3.4911673999999997, 3.5696205999999995, 3.569...",sitting,6.158576,6.832620,3.602309,0.627626,1.098345,1.019892,-0.222889,-0.062302,-0.477676
1996,"[6.2370294, 6.276256, 6.1585762, 6.3154826, 6....","[6.668522, 6.8254284, 7.1000146, 6.5116156, 6....","[3.6088471999999996, 3.6480737999999997, 3.530...",sitting,6.191265,6.794047,3.607540,0.313813,0.588399,0.353039,-0.482589,-0.040348,-0.378993
1997,"[6.1978028, 6.1585762, 6.1585762, 6.2370294, 6...","[6.7077486, 6.864655, 6.786201800000001, 6.903...","[3.6088471999999996, 3.6480737999999997, 3.648...",sitting,6.179497,6.812353,3.601002,0.235360,0.313813,0.431493,-0.139266,-0.177081,-0.378796
1998,"[6.1978028, 6.2370294, 6.1978028, 6.1585762, 6...","[6.864655, 6.786201800000001, 6.78620180000000...","[3.6088471999999996, 3.5696205999999995, 3.530...",sitting,6.101698,6.884922,3.572889,0.470719,0.980665,0.392266,-0.443920,-0.060412,-0.089779


In [26]:
# There are only two possible labels, so we can encode them with a dict

def encode_label(lbl):
    label_map = {'walking': 1, 'sitting': 0}
    return label_map.get(lbl, -1)  # Returns -1 if the label is not found
    
df['label'] = df['label'].apply(encode_label)

In [27]:
df

Unnamed: 0,x_values,y_values,z_values,label,x_mean,y_mean,z_mean,x_range,y_range,z_range,xy_corr,yz_corr,xz_corr
0,"[-1.2552512, -1.2160246, -1.0983448, -1.216024...","[1.6475172, 1.2160246, 1.6475172, 0.9414384, 1...","[8.7867584, 8.7867584, 8.747531799999999, 8.82...",1,-1.196411,1.313437,8.807025,0.274586,1.216025,0.353039,-0.098106,-0.526352,-0.182488
1,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2944778, 1.1767979999999998, 1.3337044, 1.1...","[8.8652116, 8.8652116, 8.825985, 8.9044382, 8....",1,-1.148686,1.106844,8.848213,0.274586,1.059118,0.235360,-0.240468,-0.340499,-0.322237
2,"[-1.0983448, -1.0983448, -1.1375714, -1.098344...","[1.0591182, 1.2160246, 1.2944778, 1.0983448, 1...","[8.9044382, 8.8652116, 8.8652116, 8.8652116, 8...",1,-1.143455,1.148686,8.851482,0.431493,0.784532,0.431493,0.187816,-0.364041,-0.513297
3,"[-1.1767979999999998, -1.1767979999999998, -1....","[1.2552512, 1.1375714, 1.1375714, 1.1767979999...","[8.7867584, 8.825985, 8.825985, 8.825985, 8.82...",1,-1.117304,1.165684,8.823370,0.235360,0.902212,0.313813,0.091160,-0.215989,-0.238869
4,"[0.2353596, 0.2353596, 0.3138128, 0.2745862, 0...","[-0.196133, -0.196133, -0.196133, -0.2353596, ...","[9.1005712, 9.0613446, 9.0613446, 9.1005712, 9...",1,0.271971,-0.209209,9.054153,0.156906,0.235360,0.235360,0.109223,-0.355972,-0.468424
...,...,...,...,...,...,...,...,...,...,...,...,...,...
1995,"[6.2370294, 6.1978028, 6.1585762, 6.1978028, 6...","[6.7469752000000005, 6.864655, 6.7862018000000...","[3.4911673999999997, 3.5696205999999995, 3.569...",0,6.158576,6.832620,3.602309,0.627626,1.098345,1.019892,-0.222889,-0.062302,-0.477676
1996,"[6.2370294, 6.276256, 6.1585762, 6.3154826, 6....","[6.668522, 6.8254284, 7.1000146, 6.5116156, 6....","[3.6088471999999996, 3.6480737999999997, 3.530...",0,6.191265,6.794047,3.607540,0.313813,0.588399,0.353039,-0.482589,-0.040348,-0.378993
1997,"[6.1978028, 6.1585762, 6.1585762, 6.2370294, 6...","[6.7077486, 6.864655, 6.786201800000001, 6.903...","[3.6088471999999996, 3.6480737999999997, 3.648...",0,6.179497,6.812353,3.601002,0.235360,0.313813,0.431493,-0.139266,-0.177081,-0.378796
1998,"[6.1978028, 6.2370294, 6.1978028, 6.1585762, 6...","[6.864655, 6.786201800000001, 6.78620180000000...","[3.6088471999999996, 3.5696205999999995, 3.530...",0,6.101698,6.884922,3.572889,0.470719,0.980665,0.392266,-0.443920,-0.060412,-0.089779


<hr>

Now that the features are extracted, we can drop the raw acceleration lists and keep only the label and the extracted features for training.
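
Slicing by position depends on the column order staying the same. As a possible alternative (not the approach used in this notebook), the raw columns could be dropped by name. The sketch below uses a tiny made-up DataFrame, and the `_values` suffix is assumed from the column names shown earlier.

```python
import pandas as pd

# Tiny made-up frame with the same kind of layout as the real one
toy = pd.DataFrame({
    "x_values": [[0.1, 0.2], [0.3, 0.4]],
    "label": [1, 0],
    "x_mean": [0.15, 0.35],
    "x_range": [0.1, 0.1],
})

# Drop the raw list columns by name rather than by position
raw_columns = [c for c in toy.columns if c.endswith("_values")]
features_only = toy.drop(columns=raw_columns)

print(list(features_only.columns))
```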

<hr>

In [28]:
df = df.iloc[:, 3:]
df # let's take another look at the DataFrame now

Unnamed: 0,label,x_mean,y_mean,z_mean,x_range,y_range,z_range,xy_corr,yz_corr,xz_corr
0,1,-1.196411,1.313437,8.807025,0.274586,1.216025,0.353039,-0.098106,-0.526352,-0.182488
1,1,-1.148686,1.106844,8.848213,0.274586,1.059118,0.235360,-0.240468,-0.340499,-0.322237
2,1,-1.143455,1.148686,8.851482,0.431493,0.784532,0.431493,0.187816,-0.364041,-0.513297
3,1,-1.117304,1.165684,8.823370,0.235360,0.902212,0.313813,0.091160,-0.215989,-0.238869
4,1,0.271971,-0.209209,9.054153,0.156906,0.235360,0.235360,0.109223,-0.355972,-0.468424
...,...,...,...,...,...,...,...,...,...,...
1995,0,6.158576,6.832620,3.602309,0.627626,1.098345,1.019892,-0.222889,-0.062302,-0.477676
1996,0,6.191265,6.794047,3.607540,0.313813,0.588399,0.353039,-0.482589,-0.040348,-0.378993
1997,0,6.179497,6.812353,3.601002,0.235360,0.313813,0.431493,-0.139266,-0.177081,-0.378796
1998,0,6.101698,6.884922,3.572889,0.470719,0.980665,0.392266,-0.443920,-0.060412,-0.089779


<hr>
We can now train the model. For this week, let's use a Support Vector Machine (SVM) model. SVM is commonly used in binary classification and works well on small-to-medium sized data.

As SVM relies on the concept of distance between data points to find the optimal hyperplane that separates different classes, scaling is required to put features into similar ranges. 
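
To illustrate what scaling does, the following sketch applies `StandardScaler` to a small made-up array whose two columns are on very different scales. The data and the names `X_toy` and `scaler_toy` are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up features on different scales (e.g. a mean and a correlation)
X_toy = np.array([
    [9.0, 0.10],
    [8.0, -0.20],
    [3.0, 0.40],
    [4.0, -0.30],
])

scaler_toy = StandardScaler()
X_scaled = scaler_toy.fit_transform(X_toy)

# After scaling, each column has mean ~0 and standard deviation ~1
print(X_scaled.mean(axis=0).round(6))
print(X_scaled.std(axis=0).round(6))
```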
<hr>

In [29]:
from sklearn.svm import SVC
from sklearn.preprocessing import StandardScaler

# Same as week 7, train_test_split and evaluation tools are required 
from sklearn.metrics import classification_report, accuracy_score
from sklearn.model_selection import train_test_split

In [30]:
X = df.iloc[:, 1:]  # All columns after the first one are features
y = df['label']      # The first column is the target (label)

In [31]:
# Standardize the features (zero mean, unit variance per column)
scaler = StandardScaler()
X_normalized = scaler.fit_transform(X)
X_normalized

array([[-0.77763893, -0.63530781,  0.90317761, ..., -0.2687097 ,
        -1.09678083, -0.2408278 ],
       [-0.76640983, -0.68909079,  0.91206351, ..., -0.60724429,
        -0.68577841, -0.56632177],
       [-0.76517924, -0.67819803,  0.91276874, ...,  0.41121031,
        -0.73784045, -1.01132414],
       ...,
       [ 0.95779658,  0.79623857, -0.2199722 , ..., -0.36658596,
        -0.32438925, -0.69805523],
       [ 0.9394916 ,  0.81513069, -0.22603718, ..., -1.09105009,
        -0.06638503, -0.02489821],
       [ 0.88365371,  0.90346413, -0.29627812, ..., -1.77642952,
        -1.38540358,  0.97810711]])

In [32]:
# Split the dataset into training and testing sets
# Using 80% of the samples for training, and 20% for testing

X_train, X_test, y_train, y_test = train_test_split(X_normalized, y, test_size=0.2)

<hr>

SVM is a flexible algorithm that can use different "kernels" to train the model. Each kernel employs a specific equation to calculate the similarity between data points. The four commonly used kernels are:

- Linear Kernel: Best for linearly separable data, simplest and fastest.
- Polynomial Kernel: Introduces polynomial decision boundaries, useful for more complex patterns but can be computationally expensive.
- RBF (Gaussian) Kernel: Highly flexible, effective for non-linear data, but computationally intensive and requires careful parameter tuning.
- Sigmoid Kernel: Similar to neural networks, less common in practice, can be harder to control and interpret.

For this task, let's try the polynomial kernel.
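
If you are curious how the kernels compare, one option is to cross-validate each of them. The sketch below does this on synthetic stand-in data generated with `make_classification`, so the scores say nothing about the activity dataset; the names `X_demo`, `y_demo`, and `scores` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data: 9 features, 2 classes (not the activity dataset)
X_demo, y_demo = make_classification(
    n_samples=200, n_features=9, n_informative=5, random_state=0
)

scores = {}
for kernel in ["linear", "poly", "rbf", "sigmoid"]:
    # The scaler inside the pipeline is refit on each training fold
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    scores[kernel] = cross_val_score(clf, X_demo, y_demo, cv=5).mean()

for kernel, score in scores.items():
    print(f"{kernel}: {score:.3f}")
```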
<hr>

In [33]:
# Create an SVM classifier with a polynomial kernel
svm_model = SVC(kernel='poly')

# Train the SVM model
svm_model.fit(X_train, y_train)

In [34]:
y_pred = svm_model.predict(X_test)

# Evaluate the model performance
print("Accuracy:", accuracy_score(y_test, y_pred))
print("Classification Report:\n", classification_report(y_test, y_pred))

Accuracy: 0.9225
Classification Report:
               precision    recall  f1-score   support

           0       0.88      0.97      0.92       193
           1       0.97      0.87      0.92       207

    accuracy                           0.92       400
   macro avg       0.93      0.92      0.92       400
weighted avg       0.93      0.92      0.92       400



<hr>

Now that we've trained the model, we can use the test data to evaluate its accuracy. If everything is working correctly, you should see an accuracy of around 90% or higher, as the two motions are fairly distinguishable. The exact figure will vary between runs because the train/test split is random.
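
The classification report summarises precision and recall per class. A confusion matrix shows the underlying counts directly. The following is a minimal sketch with made-up labels (not the results above), using the same encoding of 0 for sitting and 1 for walking.

```python
from sklearn.metrics import confusion_matrix

# Made-up labels for illustration: 0 = sitting, 1 = walking
y_true_demo = [0, 0, 0, 1, 1, 1]
y_pred_demo = [0, 0, 1, 1, 1, 0]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true_demo, y_pred_demo)
print(cm)
```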

Next up, let's explore how to deploy your model to your Raspberry Pi using Python's built-in `pickle` module.

<hr>

In [35]:
import pickle

In [36]:
# Save the trained model to a file
# the file is essentially a stream of bytes; it's common to use .pkl as the file extension

with open('svm_model.pkl', 'wb') as file:
    pickle.dump(svm_model, file)

In [37]:
# we will also need the same scaler which was used to train the model
with open('scaler.pkl', 'wb') as file:
    pickle.dump(scaler, file)

<hr>

You can find the two .pkl files in the same directory as this Notebook. You can send the files to the Raspberry Pi using scp (secure file copy over SSH).

`scp svm_model.pkl [user]@[hostname]:~/Documents`

`scp scaler.pkl [user]@[hostname]:~/Documents`

On your Raspberry Pi, you can load the files back into Python objects, provided that the scikit-learn library there is the same version as the one used to save them. You can check this on your Pi by running the following command in a terminal:

`pip show scikit-learn`

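You should get output similar to the following. The example below was captured on a Windows PC, so the `Location` line on your Pi will show a Linux path instead: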

```
Name: scikit-learn
Version: 1.5.1
Summary: A set of python modules for machine learning and data mining
Home-page: https://scikit-learn.org
Author:
Author-email:
License: new BSD
Location: C:\Users\TawLe\AppData\Local\Programs\Python\Python311\Lib\site-packages
Requires: joblib, numpy, scipy, threadpoolctl
Required-by:
```

<hr>

In [38]:
# run this cell to check the version of scikit-learn on your PC

import sklearn
sklearn.__version__

'1.5.1'

<hr>

Check that the two versions match exactly. If they do not, you might need to upgrade scikit-learn on your Pi:

`pip install --upgrade scikit-learn --break-system-packages`
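
One way to make a version mismatch easier to spot is to store the scikit-learn version next to the model when saving, and compare it after loading. The sketch below is only an assumption about how this could be done and is not part of the workflow above. The `bundle` dictionary and its keys are hypothetical, and an untrained `SVC` stands in for the real model.

```python
import io
import pickle

import sklearn
from sklearn.svm import SVC

# Stand-in for the trained model; "bundle" and its keys are hypothetical names
model_demo = SVC(kernel="poly")
bundle = {"model": model_demo, "sklearn_version": sklearn.__version__}

# Use an in-memory buffer here; on disk you would open a .pkl file instead
buffer = io.BytesIO()
pickle.dump(bundle, buffer)
buffer.seek(0)

# On the loading side, compare the stored version with the installed one
loaded = pickle.load(buffer)
versions_match = loaded["sklearn_version"] == sklearn.__version__
if not versions_match:
    print("Warning: model was saved with scikit-learn", loaded["sklearn_version"])
print(versions_match)
```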

Once completed, you can move on to loading the model on your Pi.

<b>Note: the following cell may run without errors in Jupyter Notebook, but you should carry out the remaining steps on your Raspberry Pi to deploy the model.</b>

Create a new Python script in the directory where you saved the two .pkl files (i.e., `~/Documents`). Experiment with the following code:

<hr>

In [39]:
# Run this on your Pi

import pickle

with open('svm_model.pkl', 'rb') as file:
    model = pickle.load(file)

with open('scaler.pkl', 'rb') as file:
    scaler = pickle.load(file)

print(model)  # check that we got the SVC model back correctly

SVC(kernel='poly')


<hr>

To use the deployed model and make a single prediction: 

<hr>

In [44]:
# Run this on your Pi
# A dummy data point created with arbitrary values

import warnings
import pickle
import numpy as np
warnings.filterwarnings("ignore")

with open('svm_model.pkl', 'rb') as file:
    model = pickle.load(file)

with open('scaler.pkl', 'rb') as file:
    scaler = pickle.load(file)
    
# Feature order must match training: x/y/z mean, x/y/z range, xy/yz/xz correlation
features = np.array([6.0, 6.3, 2.1, 2.5, 1.3, 4.6, 0, 0, -0.1]).reshape(1, -1)

# use the same scaler to transform data, as we did in training
features_normalized = scaler.transform(features)

# make a single prediction using fake features
predicted_label = model.predict(features_normalized)[0]

# Labels were encoded as walking = 1 and sitting = 0 during training
if predicted_label == 1:
    print("Walking")
elif predicted_label == 0:
    print("Sitting")
else:
    print("Unknown activity")

Walking


<hr>
<h2> Final Challenge: Real-Time Activity Prediction </h2>

Now that we’ve covered everything, your task is to write a Python program that continuously collects acceleration data from the ADXL343 sensor, calculates specific features, and uses the deployed ML model to predict whether the person is walking or sitting. All the code you need can be found in the previous practicals—you just need to put the pieces together.

<b>This task closely resembles what you will need to do for your project, so the solution will not be published.</b> However, if you get stuck, don’t hesitate to discuss it with your tutor.

<hr>