## Pose Estimation Walkthrough

This notebook records my notes from experimenting with the data, the results, and the steps needed to reproduce them.

In [1]:
import torch
import syft as sy
import numpy as np
from util import connect_to_workers

hook = sy.TorchHook(torch)  # hook PyTorch, i.e. add the extra functionality needed for federated learning

LABEL = ['Standing still', 'Sitting and relaxing', 'Lying down', 'Walking', 'Climbing', 'Running']
N_WORKER = 8
BATCH_SIZE = 32
VALID_SIZE = 0.1
GPU_FOUND = torch.cuda.is_available()
print(GPU_FOUND)

False


In [2]:
# simulate remote clients using VirtualWorker instances
workers = connect_to_workers(hook, n_workers=N_WORKER)

## Reading data

1. Split the files found by glob into training and validation sets (test files are read from a separate folder)
2. Read the CSV files
3. Construct the Dataset class

For the first experiment, I will use all available features, so each input sample has shape [1 x 21].

In [3]:
import glob
import csv
import random
import math

# folder path
PATH = 'data/Preprocessed'
TEST_PATH = 'test'
VALID_SIZE = 0.1
# seed for random
random.seed(300)

In [4]:
files = glob.glob(PATH+"/*.csv")
test_files = glob.glob(TEST_PATH+"/*.csv")

# split into train and validation sets
# according to the dataset notes, only 8 files are used for training and the rest for testing,
# so the test files must be moved manually from /training/Preprocessed to /test
count_valid = math.ceil(VALID_SIZE * len(files))
random.shuffle(files)
valid_files, train_files = files[:count_valid], files[count_valid:]
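
As a quick check of the split arithmetic: with `VALID_SIZE = 0.1` and the 8 training files mentioned in the comment above, `math.ceil(0.1 * 8)` gives 1 validation file and 7 training files. The sketch below wraps the same logic in a helper. `split_files` and the placeholder file names are hypothetical, and sorting before shuffling is an extra step I am assuming is acceptable, since `glob` does not guarantee an order.

```python
import math
import random


def split_files(files, valid_size=0.1, seed=300):
    """Shuffle a copy of `files` and split it into (valid, train) lists."""
    # Sort first so the result does not depend on the order glob returns.
    shuffled = sorted(files)
    # A private Random instance keeps the global random state untouched.
    random.Random(seed).shuffle(shuffled)
    count_valid = math.ceil(valid_size * len(shuffled))
    return shuffled[:count_valid], shuffled[count_valid:]


# Placeholder names standing in for the real CSV paths.
example_files = ["subject{}.csv".format(i) for i in range(1, 9)]
valid_files_demo, train_files_demo = split_files(example_files)
print(len(valid_files_demo), len(train_files_demo))  # 1 7
```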

In [5]:
index_to_read = 3
sample_read = []
sample_label = []
label = []

with open(train_files[index_to_read]) as csv_file:
    csv_reader = csv.reader(csv_file, delimiter=',')
    for row in csv_reader:
        if int(row[21]) != 0:
            if int(row[21]) not in label:
                label.append(int(row[21]))
            sample_read.append([float(item) for item in row[0:21]])
            sample_label.append(int(row[21]))

print(train_files[index_to_read])
print('types of label inside: {}'.format(label))

data/Preprocessed\mHealth_subject4.csv
types of label inside: [5, 1, 2, 3, 4, 6]


In [6]:
# print sample data
for index,rep in enumerate(sample_read[:3]):
    print('Chest \t\t\tLeft Ankle')
    print('{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f}'.format(*rep))
    print('Label \t\t\tRight Ankle')
    # NOTE: label[index] indexes the list of distinct labels; sample_label[index] holds the label of this row
    print('{} \t\t\t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f} \t{:.2f}'.format(label[index],*rep[12:]))
    print('')

Chest 			Left Ankle
-9.44 	0.12 	-0.87 	-0.92 	-6.83 	-3.25 	-0.52 	-0.42 	-0.03 	86.30 	-37.38 	-18.81
Label 			Right Ankle
5 			-2.48 	-9.58 	0.72 	-0.36 	-0.44 	-0.04 	19.25 	-31.74 	57.38

Chest 			Left Ankle
-9.35 	0.02 	-1.19 	-0.55 	-7.76 	-4.23 	-0.48 	-0.34 	0.04 	86.50 	-43.62 	-19.48
Label 			Right Ankle
1 			-2.52 	-9.57 	0.66 	-0.36 	-0.44 	-0.04 	17.79 	-34.74 	64.24

Chest 			Left Ankle
-8.83 	-0.12 	-0.91 	0.02 	-8.41 	-5.39 	-0.48 	-0.34 	0.04 	84.41 	-46.16 	-21.72
Label 			Right Ankle
2 			-2.21 	-9.29 	0.88 	-0.36 	-0.44 	-0.04 	16.14 	-37.38 	70.73
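
Judging from the headers printed above, the 21 feature columns appear to be grouped as 3 chest values, 9 left-ankle values, and 9 right-ankle values, with the label in column 21. The sketch below splits one row along those boundaries. The group names and the `split_row` helper are assumptions inferred from the printout, not taken from the dataset documentation.

```python
def split_row(row):
    """Split one 22-column CSV row into named sensor groups and a label."""
    if len(row) != 22:
        raise ValueError(
            "expected 21 features plus 1 label, got {} columns".format(len(row)))
    features = [float(item) for item in row[:21]]
    return {
        "chest": features[0:3],          # assumed: first 3 columns
        "left_ankle": features[3:12],    # assumed: next 9 columns
        "right_ankle": features[12:21],  # assumed: last 9 feature columns
        "label": int(row[21]),
    }


# Made-up row: features 0.0 .. 20.0 followed by label 5.
example_row = [str(float(i)) for i in range(21)] + ["5"]
groups = split_row(example_row)
print({name: len(values) for name, values in groups.items() if name != "label"})
```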



### Use Constructed Dataset

To simplify this, we create a Dataset class that reads the CSV files and returns the samples.

In [15]:
from dataloader import ImuPoseDataset

train_dataset = ImuPoseDataset(files=train_files)
valid_dataset = ImuPoseDataset(files=valid_files)
test_dataset = ImuPoseDataset(files=test_files,return_old_data = True)

print("train dataset: {}".format(len(train_dataset)))
print("valid dataset: {}".format(len(valid_dataset)))
print("test dataset: {}".format(len(test_dataset)))

train dataset: 129024
valid dataset: 18432
test dataset: 36864
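
`ImuPoseDataset` is defined in `dataloader.py`, which is not shown in this notebook. As a rough idea of what such a class has to provide, here is a minimal map-style dataset sketch (`__len__` and `__getitem__`). Everything in it is an assumption: the class name `CsvPoseDataset`, dropping rows with label 0, shifting labels to start at 0, and returning plain lists instead of tensors so the sketch runs without PyTorch.

```python
import csv
import os
import tempfile


class CsvPoseDataset:
    """Hypothetical stand-in for ImuPoseDataset: one sample per labeled CSV row."""

    def __init__(self, files, n_features=21):
        self.samples = []
        for path in files:
            with open(path, newline="") as csv_file:
                for row in csv.reader(csv_file):
                    raw_label = int(row[n_features])
                    if raw_label == 0:
                        continue  # skip the null class ("no activity")
                    features = [float(item) for item in row[:n_features]]
                    # Wrap in an outer list so each sample has shape [1 x 21];
                    # shift labels 1..6 to 0..5 (assumed convention).
                    self.samples.append(([features], raw_label - 1))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        return self.samples[index]


# Tiny made-up file: two labeled rows and one null-class row.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.csv")
    with open(demo_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([0.5] * 21 + [1])
        writer.writerow([0.1] * 21 + [0])
        writer.writerow([0.2] * 21 + [6])
    demo_dataset = CsvPoseDataset([demo_path])

print(len(demo_dataset), demo_dataset[1][1])  # 2 5
```

In the real class each sample would presumably be converted to a float tensor of shape [1, 21] and an integer label tensor before being returned.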


### Construct Dataloader

To iterate over the dataset we use PyTorch's built-in DataLoader. Since we are training with federated learning, the training set must first be converted into a FederatedDataset.

In [16]:
federated_train_loader = sy.FederatedDataLoader(
    train_dataset.federate(workers),  # <-- distribute the dataset across all the workers; it is now a FederatedDataset
    batch_size=BATCH_SIZE,
    drop_last=True,
    shuffle=True)

valid_loader = torch.utils.data.DataLoader(valid_dataset,
                                           batch_size=BATCH_SIZE,
                                           drop_last=True,
                                           shuffle=True)


test_loader = torch.utils.data.DataLoader(test_dataset,
                                           batch_size=1)
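
To get a feel for the sizes involved, the sketch below splits the 129024 training samples evenly across the 8 workers and counts the full batches each worker yields with `BATCH_SIZE = 32` and `drop_last=True`. The even contiguous split is my assumption about how `.federate()` distributes the data; `shard_ranges` is a hypothetical helper, not a PySyft function.

```python
import math


def shard_ranges(n_samples, n_workers):
    """Return (start, stop) index ranges of an even contiguous split."""
    shard_size = math.ceil(n_samples / n_workers)
    return [(start, min(start + shard_size, n_samples))
            for start in range(0, n_samples, shard_size)]


n_train, n_workers, batch_size = 129024, 8, 32
shards = shard_ranges(n_train, n_workers)
# With drop_last=True only full batches count.
batches_per_worker = [(stop - start) // batch_size for start, stop in shards]
print(len(shards), shards[0], batches_per_worker[0], sum(batches_per_worker))
# 8 (0, 16128) 504 4032
```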


In [17]:
test_iter = iter(federated_train_loader)
data,label = next(test_iter)
print(data.location)
print("our training data shape: {}, {}".format(data.shape,data.type()))
print("our training label shape: {}, {}".format(label.shape,label.type()))

<VirtualWorker id:worker1 #objects:67>
our training data shape: torch.Size([32, 1, 21]), torch.FloatTensor
our training label shape: torch.Size([32]), torch.LongTensor


### CNN for classifier

To estimate the pose, I will use a shallow CNN with 3 convolutional layers and 2 fully connected (dense) layers.

The main reference is
<b>Lima, Wesllen Sousa, et al. “Human Activity Recognition Using Inertial Sensors in a Smartphone: An Overview.” Sensors, vol. 19, no. 14, 2019, p. 3213.</b>

which surveys several methods for solving problems of this kind.

Goodfellow, Ian, et al., Deep Learning (2016), mentions that for recognizing human activity a CNN can be used for prediction by treating each row as timestamped data and processing it with 1D convolution.
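
Since each sample has 21 features, it helps to see how the sequence length shrinks through the convolutional layers. The sketch below applies the standard 1D convolution output-length formula three times. The kernel size, stride, and padding are illustrative assumptions; the actual values are in `model.py`, which is not shown here.

```python
def conv1d_out_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution (same formula as torch.nn.Conv1d)."""
    return (length + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1


length = 21  # features per sample, treated as the sequence axis
lengths = [length]
for kernel_size in (3, 3, 3):  # assumed kernel sizes for the 3 conv layers
    length = conv1d_out_length(length, kernel_size)
    lengths.append(length)
print(lengths)  # [21, 19, 17, 15]
```

The flattened size fed to the first dense layer would then be the final length multiplied by the number of output channels of the last convolution.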


In [18]:
from model import CnnModel

model = CnnModel(input_size = 1,num_classes=len(LABEL))

In [19]:
# define some global parameters
LEARNING_RATE = 0.01
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(),lr=LEARNING_RATE)

In [None]:
from trainer import train

# Construct the args dictionary passed to train
args={
    "epoch":300,
    "batch_size":BATCH_SIZE,
    "learning_patience":10,
    "learning_rate": LEARNING_RATE,
    'checkpoint':'',
    'saved_model_name':'pose_estimator',
    'save_folder':'',
}

# train the Model
train(args, model, criterion, optimizer, federated_train_loader,False,valid_loader)

--------------------------------------------------------------------------------------------
Train params: Epochs:300, Batch Size: 32, Learning Rate: 0.01
		Checkpoint saved to: /content/drive/My Drive/save_train/checkpoint.pt
		Saving Model to: /content/drive/My Drive/save_train/pose_estimator.pt
		Start at Epoch: 1
Train on CPU
--------------------------------------------------------------------------------------------


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


Epoch: 1 	Training Loss: 0.164902 	Validation Loss: 0.309491
Validation loss decreased (inf --> 0.309491).  Saving model ...
F1-score: 0.901506	 Accuracy:0.900391	 Precission:0.924522	 Recall:0.900391
Epoch: 2 	Training Loss: 0.070013 	Validation Loss: 0.236962
Validation loss decreased (0.309491 --> 0.236962).  Saving model ...
F1-score: 0.920083	 Accuracy:0.919488	 Precission:0.938846	 Recall:0.919488
Epoch: 3 	Training Loss: 0.051330 	Validation Loss: 0.231944
Validation loss decreased (0.236962 --> 0.231944).  Saving model ...
F1-score: 0.930699	 Accuracy:0.930556	 Precission:0.949803	 Recall:0.930556
Epoch: 4 	Training Loss: 0.040663 	Validation Loss: 0.245503
Epoch: 5 	Training Loss: 0.034831 	Validation Loss: 0.230911
Validation loss decreased (0.231944 --> 0.230911).  Saving model ...
F1-score: 0.931094	 Accuracy:0.931478	 Precission:0.951129	 Recall:0.931478
Epoch: 6 	Training Loss: 0.031106 	Validation Loss: 0.257184
Epoch: 7 	Training Loss: 0.027842 	Validation Loss: 0.31944
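
The log shows the model being saved only when the validation loss improves. A minimal sketch of that bookkeeping follows, replayed on the validation losses from the log above. `BestLossTracker` is a hypothetical class. How `trainer.train` actually uses `learning_patience` (for example for early stopping or for lowering the learning rate) is not shown here, so the patience counter is an assumption.

```python
class BestLossTracker:
    """Track the best validation loss and count epochs without improvement."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def update(self, valid_loss):
        """Return True when `valid_loss` is a new best (time to save)."""
        if valid_loss < self.best_loss:
            self.best_loss = valid_loss
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def out_of_patience(self):
        return self.epochs_without_improvement >= self.patience


# Validation losses of epochs 1-7, copied from the log above.
logged_losses = [0.309491, 0.236962, 0.231944, 0.245503, 0.230911, 0.257184, 0.31944]
tracker = BestLossTracker(patience=10)
saved_epochs = [epoch for epoch, loss in enumerate(logged_losses, start=1)
                if tracker.update(loss)]
print(saved_epochs)  # [1, 2, 3, 5]
```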

### Evaluating against the test dataset

After training finishes, the training log is written to the log/train_log folder, named according to the training start time. The checkpoint is also saved in log/checkpoint.

Now let's evaluate the model on separate data to see its performance.

In [20]:
from evaluate import test

args={
    "batch_size":1,
    'model_path':'model/pose_estimator.pt',
    'save_result':'',
    'test_folder':'test',
    'include_null_class':False,
}

test(test_loader, args)

--------------------------------------------------------------------------------------------
Evaluating Model with Batch Size: 1
Data for testing: 36864
--------------------------------------------------------------------------------------------
Evaluating model, please wait...


  _warn_prf(average, modifier, msg_start, len(result))
  _warn_prf(average, modifier, msg_start, len(result))


--------------------------------------------
Model model/pose_estimator.pt Performance:
--------------------------------------------
F1 Score: 0.9463975694444444
Accuracy: 0.9463975694444444
Precission: 0.9463975694444444
Recall: 0.9463975694444444
--------------------------------------------
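
All four metrics above are identical. That is what micro-averaging produces for single-label multi-class predictions, because every misclassified sample counts as exactly one false positive (for the predicted class) and one false negative (for the true class). Whether `evaluate.py` uses micro-averaging is an assumption, since that file is not shown; the sketch below only demonstrates the identity on made-up labels.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall and F1, plus accuracy."""
    classes = set(y_true) | set(y_pred)
    tp = fp = fn = 0
    # Pool true positives, false positives and false negatives over all classes.
    for cls in classes:
        for truth, pred in zip(y_true, y_pred):
            if pred == cls and truth == cls:
                tp += 1
            elif pred == cls:
                fp += 1
            elif truth == cls:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return precision, recall, f1, accuracy


# Made-up labels for illustration only.
y_true = [0, 1, 2, 2, 3, 4, 5, 5]
y_pred = [0, 1, 2, 1, 3, 4, 5, 0]
print(micro_scores(y_true, y_pred))
```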


## Result

The shallow 1D CNN was able to predict the pose from timestamped sensor data with an <b>F1-score</b> of about 94.6% on the test set. Note, however, that the number of label classes is small (only 6), and the network does not learn the outlier data (no activity). As an improvement, the Null label could be treated as a real class and included in training so the network learns its features.
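
One practical detail of that improvement is the label encoding: `torch.nn.CrossEntropyLoss` expects class indices from 0 to C-1, so including the Null label changes both the mapping and the number of classes. A minimal sketch, assuming raw labels 0 (Null) and 1 to 6 as seen in the CSV files. `encode_label` and `class_names` are hypothetical helpers, and the shift by one when Null is excluded is an assumption about the current preprocessing.

```python
LABEL = ['Standing still', 'Sitting and relaxing', 'Lying down',
         'Walking', 'Climbing', 'Running']


def encode_label(raw_label, include_null_class=False):
    """Map a raw CSV label to a class index, or None if the row is dropped."""
    if include_null_class:
        return raw_label  # 0 = Null, 1..6 = activities -> 7 classes
    if raw_label == 0:
        return None  # null rows are skipped entirely
    return raw_label - 1  # 1..6 -> 0..5 -> 6 classes


def class_names(include_null_class=False):
    """Class names in index order for the chosen encoding."""
    return (['Null'] + LABEL) if include_null_class else list(LABEL)


print(len(class_names(False)), len(class_names(True)))  # 6 7
print(class_names(True)[encode_label(4, include_null_class=True)])  # Walking
```

With the Null class included, the model would then be built with `num_classes=7` instead of 6.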