# Human Data Analytics
### Project D1: Activity Recognition

Our goal is to perform human activity recognition (HAR) using Wi-Fi signals.


## Imported Stuff

Let's begin by importing the modules we'll use in this project:
 
- <span style = "color:cyan">os</span>: lets us interact with the operating system
- <span style = "color:cyan">pathlib.Path</span>: lets us work with file system paths more easily
- <span style = "color:cyan">typing.NamedTuple</span>: provides tuples with named fields, which are easier to manipulate
- <span style = "color:cyan">rich.print</span>: a drop-in replacement for the built-in `print` that formats output so it is easier to read and understand

In [1]:
import os
from pathlib import Path
from typing import NamedTuple
from rich import print

Now that we have imported the necessary modules, we want to gather the data needed to build the dataset used in this project.
To do so, we need to change our working directory from *HDA_activity_recognition/notebook* to the project root *HDA_activity_recognition*, which contains the *data* folder. This is where `os.chdir("..")` and `os.getcwd()` come in handy.  

`os.chdir("..")` moves the current working directory up to its parent directory.  
`os.getcwd()` returns the current working directory.  


**WARNING:** Each time the next cell is run, the working directory moves up one more level, so run it only once!
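
One way to sidestep the problem described in the warning is to make the directory change idempotent. The sketch below is only an illustration, not part of the original notebook: the helper name `go_to_project_root` is hypothetical, and it assumes the project root can be recognised by the presence of a `data` folder.

```python
import os
from pathlib import Path


def go_to_project_root(marker: str = "data") -> Path:
    """Move to the closest directory (current or parent) that contains `marker`.

    Assumption: the project root is the first directory, walking upwards,
    that has a sub-folder named `marker`.
    """
    current = Path.cwd()
    for candidate in (current, *current.parents):
        if (candidate / marker).is_dir():
            os.chdir(candidate)  # does nothing harmful if we are already there
            return candidate
    raise FileNotFoundError(f"no parent directory contains a '{marker}' folder")
```

Calling `go_to_project_root()` repeatedly always ends in the same directory, so re-running the cell is harmless.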



In [2]:
os.chdir("..")
os.getcwd()

'/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition'

To navigate through the *data* folder and collect all the data we need, let's create the class `MatFile`, a subclass of `NamedTuple`, which gives us easy access to:  

- `dir_name`: string of the name of the directory containing the data 
- `dir_path`: full path of the directory
- `f_name`: name of the file we want to use

In [3]:
class MatFile(NamedTuple):
    dir_name: str
    dir_path: Path
    f_name: str


data_path = Path("data")  # relative path to the "data" directory
data_list = data_path.glob(
    "*/*.mat"
)  # lazily yields every .mat file located directly inside a subdirectory of "data"

With `data_list` providing all the `.mat` files, we create the `mat_files` list and fill it with one `MatFile` per file.
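
Two details of `Path.glob` are easy to miss: the pattern `*/*.mat` only matches files exactly one directory below the starting folder, and the result is a lazy generator that can be consumed only once. The following sketch demonstrates both on a throwaway directory; the file and folder names are made up for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Hypothetical layout, for illustration only
    (root / "S1a").mkdir()
    (root / "S1a" / "S1a_E.mat").touch()
    (root / "S1a" / "nested").mkdir()
    (root / "S1a" / "nested" / "deep.mat").touch()
    (root / "top.mat").touch()

    matches = root.glob("*/*.mat")  # generator, nothing is read yet
    first_pass = sorted(p.name for p in matches)
    second_pass = sorted(p.name for p in matches)  # generator is already exhausted

    # rglob searches the starting folder and every level below it
    recursive = sorted(p.name for p in root.rglob("*.mat"))
```

Here `first_pass` contains only `S1a_E.mat`, `second_pass` is empty, and `recursive` contains all three files. If the matches are needed more than once, wrapping the call in `list(...)` keeps them around.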

In [4]:
mat_files = []

for data in data_list:  # each item of data_list is a Path object
    mat_files.append(
        MatFile(
            dir_name=data.parent.name,  # ".parent" is the containing directory; ".name" is its name
            dir_path=data.parent.absolute(),  # ".absolute()" gives the full path of that directory
            f_name=data.stem,  # file name without the ".mat" extension
        )
    )

mat_parents = set(
    file.dir_path for file in mat_files
)  # "set()" keeps only unique values

print(mat_files[0])
print(mat_parents)

In [5]:
run_from_utils = "cd activity_recognition/utils && poetry run"
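
The `run_from_utils` prefix relies on the shell to change directory and on manual quoting of paths that contain spaces. As an alternative, here is a minimal sketch based on `subprocess.run`, which takes the working directory as a parameter and passes each argument separately, so no quoting is needed. The names `UTILS_DIR`, `build_command` and `run_in_utils` are hypothetical, and the sketch assumes the current working directory is the project root.

```python
import subprocess
from pathlib import Path

# Assumption: relative to the project root, as in the cells of this notebook
UTILS_DIR = Path("activity_recognition/utils")


def build_command(script: str, *args: object) -> list[str]:
    """Return the argument list for `poetry run python <script> <args...>`."""
    return ["poetry", "run", "python", script, *map(str, args)]


def run_in_utils(script: str, *args: object) -> int:
    """Run a script from UTILS_DIR and return its exit code."""
    completed = subprocess.run(
        build_command(script, *args), cwd=UTILS_DIR, check=False
    )
    return completed.returncode
```

For example, `run_in_utils("CSI_phase_sanitization_signal_preprocessing.py", f"{directory}/", 1, "-", 1, 7, 0)` would mirror the preprocessing call used in this notebook.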

In [6]:
# Execute preprocessing from activity_recognition/utils
for directory in mat_parents:
    os.system(
        f"{run_from_utils} python CSI_phase_sanitization_signal_preprocessing.py '{directory}/' 1 - 1 7 0"
    )

# e.g. python CSI_phase_sanitization_signal_preprocessing.py ../input_files/S1a/ 1 - 1 4 0

Traceback (most recent call last):
  File "/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/activity_recognition/utils/CSI_phase_sanitization_signal_preprocessing.py", line 98, in <module>
    with open(name_file, "wb") as fp:  # Pickling
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './phase_processing/signal_AR3a_C2.txt'
Traceback (most recent call last):
  File "/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/activity_recognition/utils/CSI_phase_sanitization_signal_preprocessing.py", line 98, in <module>
    with open(name_file, "wb") as fp:  # Pickling
         ^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: './phase_processing/signal_AR9b_L3.txt'
Traceback (most recent call last):
  File "/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/activity_recognition/utils/CSI_phase_sanitization_signal_prepr
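
The `FileNotFoundError` in the tracebacks above is raised by `open(name_file, "wb")`, which creates a missing file but not a missing parent directory. A plausible cause, although the notebook does not confirm it, is that the `phase_processing` folder does not exist yet inside `activity_recognition/utils`. The sketch below creates the output folders up front; the helper name `ensure_dirs` is hypothetical, and the folder names are taken from the paths that appear in this notebook.

```python
from pathlib import Path


def ensure_dirs(
    base: Path, names: tuple[str, ...] = ("phase_processing", "processed_phase")
) -> list[Path]:
    """Create each folder in `names` under `base` if it is missing."""
    created = []
    for name in names:
        target = base / name
        target.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        created.append(target)
    return created
```

Running `ensure_dirs(Path("activity_recognition/utils"))` before the preprocessing loop would rule this cause out.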

In [15]:
# e.g., python CSI_phase_sanitization_H_estimation.py ../input_files/S1a/ 0 S1a_E 1 4 0 -1
file_to_process = mat_files[0]
os.system(
    f"{run_from_utils} python CSI_phase_sanitization_H_estimation.py '{file_to_process.dir_path}/' 0 {file_to_process.f_name} 1 7 0 -1"
)

0

In [26]:
os.system(
    f"{run_from_utils} python CSI_phase_sanitization_signal_reconstruction.py ./phase_processing/ ./processed_phase/ 1 7 0 -1 "
)

/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/.venv/bin/python: can't open file '/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/data/CSI_phase_sanitization_signal_reconstruction.py': [Errno 2] No such file or directory


512
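
On Unix, `os.system` returns the raw wait status rather than the exit code of the command, which is why the failed call above shows `512` instead of a small number. A short sketch of how to decode it, using `os.waitstatus_to_exitcode` (available from Python 3.9):

```python
import os

raw_status = 512  # value returned by os.system in the cell above
exit_code = os.waitstatus_to_exitcode(raw_status)
```

Here `exit_code` is `2`, a non-zero value that signals failure, in line with the "can't open file" message printed by the interpreter.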

## We Loaded the Dataset ;)


In [7]:
run_from_data = Path("data").absolute()
run_from_data

PosixPath('/Users/mattiapiazza/Documents/University/Human Data Analytics/HDA_activity_recognition/data')

In [8]:
os.system(
    f"{run_from_utils} python CSI_doppler_create_dataset_train.py '{run_from_data}/doppler_traces/' S1a,S1b,S1c 31 1 340 30 E,L,W,R,J 4"
)

# e.g., python CSI_doppler_create_dataset_train.py ./doppler_traces/ S1a,S1b,S1c 31 1 340 30 E,L,W,R,J 4

S1a_E_stream_0
S1a_E_stream_1
S1a_E_stream_2
S1a_E_stream_3
S1a_J1_stream_0
S1a_J1_stream_1
S1a_J1_stream_2
S1a_J1_stream_3
S1a_J2_stream_0
S1a_J2_stream_1
S1a_J2_stream_2
S1a_J2_stream_3
S1a_L_stream_0
S1a_L_stream_1
S1a_L_stream_2
S1a_L_stream_3
S1a_R_stream_0
S1a_R_stream_1
S1a_R_stream_2
S1a_R_stream_3
S1a_W_stream_0
S1a_W_stream_1
S1a_W_stream_2
S1a_W_stream_3
S1b_E_stream_0
S1b_E_stream_1
S1b_E_stream_2
S1b_E_stream_3
S1b_J1_stream_0
S1b_J1_stream_1
S1b_J1_stream_2
S1b_J1_stream_3
S1b_J2_stream_0
S1b_J2_stream_1
S1b_J2_stream_2
S1b_J2_stream_3
S1b_L_stream_0
S1b_L_stream_1
S1b_L_stream_2
S1b_L_stream_3
S1b_R_stream_0
S1b_R_stream_1
S1b_R_stream_2
S1b_R_stream_3
S1b_W_stream_0
S1b_W_stream_1
S1b_W_stream_2
S1b_W_stream_3
ERROR - shapes mismatch
ERROR - shapes mismatch
ERROR - shapes mismatch
S1c_E_stream_0
S1c_E_stream_1
S1c_E_stream_2
S1c_E_stream_3
S1c_J1_stream_0
S1c_J1_stream_1
S1c_J1_stream_2
S1c_J1_stream_3
S1c_J2_stream_0
S1c_J2_stream_1
S1c_J2_stream_2
S1c_J2_stream_3
S1c_

0

In [9]:
os.system(
    f"{run_from_utils} python CSI_doppler_create_dataset_test.py '{run_from_data}/doppler_traces/' S2a,S2b,S3a,S4a,S4b,S5a,S6a,S6b,S7a 31 1 340 30 E,L,W,R,J 4"
)

# e.g., python CSI_doppler_create_dataset_test.py ./doppler_traces/ S2a,S2b,S3a,S4a,S4b,S5a,S6a,S6b,S7a 31 1 340 30 E,L,W,R,J 4

S2a_E_stream_0
S2a_E_stream_1
S2a_E_stream_2
S2a_E_stream_3
S2a_J_stream_0
S2a_J_stream_1
S2a_J_stream_2
S2a_J_stream_3
S2a_L_stream_0
S2a_L_stream_1
S2a_L_stream_2
S2a_L_stream_3
S2a_R_stream_0
S2a_R_stream_1
S2a_R_stream_2
S2a_R_stream_3
S2a_W_stream_0
S2a_W_stream_1
S2a_W_stream_2
S2a_W_stream_3
S2b_E_stream_0
S2b_E_stream_1
S2b_E_stream_2
S2b_E_stream_3
S2b_J1_stream_0
S2b_J1_stream_1
S2b_J1_stream_2
S2b_J1_stream_3
S2b_J2_stream_0
S2b_J2_stream_1
S2b_J2_stream_2
S2b_J2_stream_3
S2b_L_stream_0
S2b_L_stream_1
S2b_L_stream_2
S2b_L_stream_3
S2b_R_stream_0
S2b_R_stream_1
S2b_R_stream_2
S2b_R_stream_3
S2b_W_stream_0
S2b_W_stream_1
S2b_W_stream_2
S2b_W_stream_3
S3a_E_stream_0
S3a_E_stream_1
S3a_E_stream_2
S3a_E_stream_3
S3a_J1_stream_0
S3a_J1_stream_1
S3a_J1_stream_2
S3a_J1_stream_3
S3a_J2_stream_0
S3a_J2_stream_1
S3a_J2_stream_2
S3a_J2_stream_3
S3a_L_stream_0
S3a_L_stream_1
S3a_L_stream_2
S3a_L_stream_3
S3a_R_stream_0
S3a_R_stream_1
S3a_R_stream_2
S3a_R_stream_3
S3a_W_stream_0
S3a_W_str

0

In [17]:
os.system(
    f"{run_from_utils} python CSI_network.py '{run_from_data}/doppler_traces/' S1a 100 340 1 32 4 single_ant E,L,W,R,J"
)

# e.g., python CSI_network.py ./doppler_traces/ S1a 100 340 1 32 4 single_ant E,L,W,R,J

[]
Model: "csi_model"
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━┓
┃ Layer (type)        ┃ Output Shape      ┃    Param # ┃ Connected to      ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━┩
│ input_layer         │ (None, 340, 100,  │          0 │ -                 │
│ (InputLayer)        │ 1)                │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ 1stconv3_1_res_a    │ (None, 340, 100,  │          6 │ input_layer[0][0] │
│ (Conv2D)            │ 3)                │            │                   │
├─────────────────────┼───────────────────┼────────────┼───────────────────┤
│ activation_1        │ (None, 340, 

2024-02-26 15:39:38.634828: W tensorflow/core/kernels/data/cache_dataset_ops.cc:302] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.


226/226 ━━━━━━━━━━━━━━━━━━━━ 0s 92ms/step - loss: 1.4472 - sparse_categorical_accuracy: 0.2536

2024-02-26 15:40:01.711992: W tensorflow/core/kernels/data/cache_dataset_ops.cc:302] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.


226/226 ━━━━━━━━━━━━━━━━━━━━ 27s 102ms/step - loss: 1.4467 - sparse_categorical_accuracy: 0.2540 - val_loss: 1.1360 - val_sparse_categorical_accuracy: 0.6531
Epoch 2/25
226/226 ━━━━━━━━━━━━━━━━━━━━ 22s 99ms/step - loss: 0.9941 - sparse_categorical_accuracy: 0.7419 - val_loss: 0.5121 - val_sparse_categorical_accuracy: 0.8205
Epoch 3/25
226/226 ━━━━━━━━━━━━━━━━━━━━ 22s 97ms/step - loss: 0.4032 - sparse_categorical_accuracy: 0.8735 - val_loss: 0.3139 - val_sparse_categorical_accuracy: 0.8875
Epoch 4/25
226/226 ━━━━━━━━━━━━━━━━━━━━ 22s 100ms/step - loss: 0.2513 - sparse_categorical_accuracy: 0.9304 - val_loss: 0.2158 - val_sparse_categorical_accuracy: 0.9138
Epoch 5/25
226/226 ━━━━━━━━━━━━━━━━━━━━ 22s 98ms/step - loss: 0.1733 - sparse_categorical_accuracy: 0.9634 - val_loss: 0.1430 - val_sparse_categorical_accuracy: 0.9786

2024-02-26 15:49:16.477797: W tensorflow/core/kernels/data/cache_dataset_ops.cc:302] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.


70/70 ━━━━━━━━━━━━━━━━━━━━ 2s 27ms/step
69/69 ━━━━━━━━━━━━━━━━━━━━ 3s 37ms/step


2024-02-26 15:49:21.026311: W tensorflow/core/kernels/data/cache_dataset_ops.cc:302] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.


0

In [44]:
os.system(
    f"{run_from_utils} python CSI_network_test.py --dir '{run_from_data}/doppler_traces/' --subdirs S7a --feature_length 100 --sample_length 340 --channels 1 --batch_size 32 --num_tot 4 --name_base single_ant --activities E,L,W,R,J"
)

# e.g., python CSI_network_test.py --dir ./doppler_traces/ --subdirs S7a --feature_length 100 --sample_length 340 --channels 1 --batch_size 32 --num_tot 4 --name_base single_ant --activities E,L,W,R,J

[]
removing single_ant_E,L,W,R,J_cache_complete.data-00000-of-00001
removing single_ant_E,L,W,R,J_cache_complete.index
2872 files read
359/359 ━━━━━━━━━━━━━━━━━━━━ 12s 33ms/step
accuracy 0.9394150417827298
fscore [1.         1.         0.8993187  0.95384615 0.81028152]
[[601   0   0   0   0]
 [  0 552   0   0   0]
 [  0   0 594  19   0]
 [  0   0   0 620   0]
 [  0   0 114  41 331]]


2024-02-27 00:59:16.181558: W tensorflow/core/kernels/data/cache_dataset_ops.cc:302] The calling iterator did not fully read the dataset being cached. In order to avoid unexpected truncation of the dataset, the partially cached contents of the dataset  will be discarded. This can happen if you have an input pipeline similar to `dataset.cache().take(k).repeat()`. You should use `dataset.take(k).cache().repeat()` instead.
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


0
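
The accuracy and the per-class F-scores printed above can be recomputed directly from the confusion matrix, which is a useful sanity check. The sketch below uses plain Python and assumes the usual scikit-learn convention (rows are true classes, columns are predicted classes), with classes in the order E, L, W, R, J. It illustrates the arithmetic and is not the code used inside `CSI_network_test.py`.

```python
# Confusion matrix copied from the output above
confusion = [
    [601, 0, 0, 0, 0],
    [0, 552, 0, 0, 0],
    [0, 0, 594, 19, 0],
    [0, 0, 0, 620, 0],
    [0, 0, 114, 41, 331],
]

n = len(confusion)
total = sum(sum(row) for row in confusion)          # number of test samples
correct = sum(confusion[i][i] for i in range(n))    # samples on the diagonal
accuracy = correct / total

row_sums = [sum(row) for row in confusion]                               # true samples per class
col_sums = [sum(confusion[r][c] for r in range(n)) for c in range(n)]    # predictions per class

# F1 = 2 * TP / (true samples + predicted samples) for each class
f_scores = [2 * confusion[i][i] / (row_sums[i] + col_sums[i]) for i in range(n)]
```

The total is 2872, matching the number of files read, and 2698 of them lie on the diagonal, which gives the reported accuracy of about 0.9394. The three F-scores below 1 come from the confusion between the last three classes (walking, running, jumping).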

In [17]:
os.system(
    f"{run_from_utils} python CSI_network_metrics.py 'complete_different_E,L,W,R,J_S7a_band_80_subband_1' E,L,W,R,J"
)

single antenna - average accuracy 0.717967, average precision 0.719880, average recall 0.715279, average fscore 0.715279
fscores - empty 0.975268, sitting 0.949381, walking 0.559080, running 0.599182, jumping 0.475447
average fscore 0.711672
accuracies - empty 0.984193, sitting 0.989583, walking 0.614600, running 0.502419, jumping 0.485597

-- FINAL DECISION --
max-merge - average accuracy 0.939415, average precision 0.950150, average recall 0.930015, average fscore 0.932689
fscores - empty 1.000000, sitting 1.000000, walking 0.899319, running 0.953846, jumping 0.810282
accuracies - empty 1.000000, sitting 1.000000, walking 0.969005, running 1.000000, jumping 0.681070

accuracies - one antenna 0.717967, two antennas 0.825441, three antennas 0.906250, four antennas 0.939415
fscores - one antenna 0.671384, two antennas 0.805051, three antennas 0.899097, four antennas 0.932689


0
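
The "max-merge" averages reported above are consistent with macro averages (unweighted means over the five classes) of the confusion matrix printed earlier by `CSI_network_test.py`, and the per-class "accuracies" coincide with per-class recall. The sketch below checks this under the assumption that rows are true classes and columns are predicted classes; it is an illustration, not the code of `CSI_network_metrics.py`.

```python
# Confusion matrix from the CSI_network_test.py output (classes: E, L, W, R, J)
confusion = [
    [601, 0, 0, 0, 0],
    [0, 552, 0, 0, 0],
    [0, 0, 594, 19, 0],
    [0, 0, 0, 620, 0],
    [0, 0, 114, 41, 331],
]

n = len(confusion)
diag = [confusion[i][i] for i in range(n)]
row_sums = [sum(row) for row in confusion]
col_sums = [sum(confusion[r][c] for r in range(n)) for c in range(n)]

recall = [diag[i] / row_sums[i] for i in range(n)]      # share of each true class recovered
precision = [diag[i] / col_sums[i] for i in range(n)]   # share of each prediction that is right
f_score = [2 * diag[i] / (row_sums[i] + col_sums[i]) for i in range(n)]

# Macro averages: every class counts the same, regardless of its size
avg_recall = sum(recall) / n
avg_precision = sum(precision) / n
avg_f_score = sum(f_score) / n
```

Rounded to six decimals, these reproduce the reported average precision, recall and F-score, and `recall` matches the "accuracies" line (for instance 331/486 ≈ 0.681070 for jumping).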

In [26]:
os.system(
    f"{run_from_utils} python CSI_network_metrics_plot.py 'complete_different_E,L,W,R,J_S7a_band_80_subband_1' E,L,W,R,J"
)

# e.g., python CSI_network_metrics_plot.py complete_different_E,L,W,R,J_S7a_band_80_subband_1 E,L,W,R,J

  plt.savefig(name_fig)
  plt.savefig(name_fig)
  plt.savefig(name_fig)


0

In [29]:
import pickle

p = Path(
    "activity_recognition/utils/outputs/complete_different_E,L,W,R,J_S7a_band_80_subband_1.txt"
)

with open(p, "rb") as fp:
    print(pickle.load(fp))
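
Note that the results file has a `.txt` extension but is read in binary mode with `pickle.load`, so it is a pickle file, and its relative path only resolves when the working directory is the project root. The following sketch wraps the loading step in a small helper that reports a clearer error when the file is missing. The name `load_pickle` is hypothetical, and, as with any pickle, only files from a trusted source should be loaded.

```python
import pickle
from pathlib import Path
from typing import Any


def load_pickle(path: Path) -> Any:
    """Load a pickled object, failing with a descriptive message if it is absent."""
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found; current directory is {Path.cwd()}"
        )
    with path.open("rb") as fp:
        return pickle.load(fp)
```

With this helper, the last cell would become `print(load_pickle(p))`.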