## 0. Data Preprocessing

In [1], the authors use the NTU RGB+D 60 dataset for their experiments. The dataset covers 60 action classes, and each skeleton sequence is captured at 30 fps with 25 joints per skeleton.

The preprocessing the authors perform consists of several steps:

1. Denoise the raw skeleton data
2. Remove skeleton files that contain poor data
3. Remove files that contain two actors, which removes 11 action classes

The dataset is split into training and test sets by camera view. The first camera view is used for evaluation, while the other two are used for training.

The sequences are cut or repeated until each sequence has a length of T = 75. However, we leave this to be done dynamically in the Dataloader in case we want to use a different sampling mechanism.
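As a rough illustration of the cut-or-repeat step, the sketch below fits a sequence to a fixed number of frames. `fit_length` is a hypothetical helper and not taken from [1]. It assumes each sequence is a NumPy array of shape `(frames, joints, 3)`, that cutting keeps the first `T` frames, and that repeating loops the sequence from its start. The paper may use a different cropping or sampling rule.

```python
import numpy as np


def fit_length(sequence: np.ndarray, target_len: int = 75) -> np.ndarray:
    """Cut or repeat a (frames, joints, 3) sequence to exactly target_len frames.

    Assumption: long sequences keep their first target_len frames and short
    sequences are looped from the start until they are long enough.
    """
    num_frames = sequence.shape[0]
    if num_frames == 0:
        raise ValueError("Cannot fit an empty sequence.")
    if num_frames >= target_len:
        return sequence[:target_len]
    # Ceiling division: number of full copies needed to reach target_len.
    repeats = -(-target_len // num_frames)
    looped = np.concatenate([sequence] * repeats, axis=0)
    return looped[:target_len]
```

Calling such a helper inside the dataset's `__getitem__` would keep the stored files untouched, so a different sampling mechanism could be swapped in later.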

### 0.1 Preprocessing

We start by listing the raw skeleton files and filtering out any samples with missing skeletons.

In [None]:
import os 
import re
import pickle
import numpy as np
from pathlib import Path
import open3d as o3d

ntu60_path = '/media/ubi-lab-desktop/Extreme Pro/data/nturgb+d_skeletons'
ntu60_files = os.listdir(ntu60_path)

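# Skip the first three lines of the list, which are treated as header lines.
with open('sitc/data/NTU_RGBD_samples_with_missing_skeletons.txt', 'r') as f:
    # strip() also removes trailing spaces and '\r'; a set gives fast membership checks.
    missing_skeletons = {line.strip() for line in f.readlines()[3:]}

ntu60_files = [file for file in ntu60_files if Path(file).stem not in missing_skeletons]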

In [None]:
# Since the authors do not specify how the denoising is performed, we skip this step for now and only remove the files that contain 2 actors.
with open('sitc/data/NTU_RGBD_actions_with_two_people.txt', 'r') as f:
    two_people = [line.split("\n")[0] for line in f.readlines()]

ntu60_files = [file for file in ntu60_files if re.findall('[A-Z][^A-Z]*', Path(file).stem)[-1] not in two_people]

In [14]:
print(f"Number of samples in filtered data: {len(ntu60_files)}")

Number of samples in filtered data: 46231
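The cells above recover the camera and action codes by splitting each file name at its uppercase letters. NTU RGB+D file names follow the pattern `SsssCcccPpppRrrrAaaa` (setup, camera, performer, replication, action), so a single named parser may be easier to read and to reuse. The sketch below is one possible way to do this; `SampleID` and `parse_sample_name` are hypothetical names that do not exist in the notebook.

```python
import re
from pathlib import Path
from typing import NamedTuple

# Matches names such as "S001C002P003R002A013".
_NAME_PATTERN = re.compile(r"S(\d{3})C(\d{3})P(\d{3})R(\d{3})A(\d{3})")


class SampleID(NamedTuple):
    setup: int
    camera: int
    person: int
    replication: int
    action: int


def parse_sample_name(file_name: str) -> SampleID:
    """Parse the five numeric fields from an NTU RGB+D file name or stem."""
    match = _NAME_PATTERN.fullmatch(Path(file_name).stem)
    if match is None:
        raise ValueError(f"Unexpected NTU RGB+D file name: {file_name!r}")
    return SampleID(*(int(group) for group in match.groups()))
```

With this helper, a filter such as `parse_sample_name(file).camera in (2, 3)` could replace the positional indexing into the `re.findall` result.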


### 0.2 Data Split Generation

We split the data depending on the camera ID, as defined in Section 3.2.2 of [2].

In [19]:
# To generate the train/test splits, we have to filter the samples of camera 1 for validation, and the samples of cameras 2 and 3 for training.
ntu60_train = [file for file in ntu60_files if re.findall('[A-Z][^A-Z]*', Path(file).stem)[1] in ['C002', 'C003']]

ntu60_val = [file for file in ntu60_files if re.findall('[A-Z][^A-Z]*', Path(file).stem)[1] in ['C001']]

print(f"Number of samples in training set: {len(ntu60_train)}")
print(f"Number of samples in validation set: {len(ntu60_val)}")

Number of samples in training set: 30757
Number of samples in validation set: 15474


In [None]:
# We save the train/test split file IDs in a .txt file for easy data loading.
with open('sitc/data/splits/ntu60_train.txt', 'w') as f:
    for file in ntu60_train:
        f.write(f"{Path(file).stem}\n")

with open('sitc/data/splits/ntu60_val.txt', 'w') as f:
    for file in ntu60_val:
        f.write(f"{Path(file).stem}\n")
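For completeness, reading a split file back could look like the sketch below. `read_split_ids` is a hypothetical helper and only assumes the one-ID-per-line format written above.

```python
def read_split_ids(split_file):
    """Return the sample IDs stored one per line in a split file."""
    with open(split_file, "r") as f:
        # Ignore blank lines and strip the trailing newline from each ID.
        return [line.strip() for line in f if line.strip()]
```

For example, `read_split_ids('sitc/data/splits/ntu60_train.txt')` would return the list of training sample IDs, which a dataset class could then map back to skeleton files.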

### 0.3 Data Pairings

We have to construct paired samples where two distinct actors ($p$ and $p'$) perform the same two actions ($a$ and $a'$) under an identical camera view.

Therefore, we additionally save the IDs of all paired samples in separate .txt files, one for each action-camera pair.

In [None]:
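# Group sample IDs by (camera, action), then write each group once with 'w' so re-running the cell does not duplicate entries.
pair_groups = {}
for file in ntu60_files:
    fields = re.findall('[A-Z][^A-Z]*', Path(file).stem)
    camera, action = fields[1], fields[-1]
    pair_groups.setdefault((camera, action), []).append(Path(file).stem)

for (camera, action), stems in pair_groups.items():
    with open(f'sitc/data/NTU_RGBD_{camera}_{action}.txt', 'w') as f:
        f.write("\n".join(stems) + "\n")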


In [None]:
# TODO: randomly pair together two samples of the same action and camera, different person to create the pairs.
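One possible way to approach this TODO is sketched below. It groups sample IDs by camera and action, then gives every sample a randomly chosen partner from the same group that was performed by a different person. `make_pairs` and its `seed` parameter are hypothetical, and the pairing rule follows the TODO comment rather than any procedure confirmed in [1].

```python
import random
import re
from collections import defaultdict


def make_pairs(sample_ids, seed=0):
    """Pair each sample with a random same-camera, same-action sample of another person.

    Assumption: IDs follow the NTU RGB+D naming scheme, e.g. "S001C001P001R001A001".
    Samples without an eligible partner are skipped.
    """
    rng = random.Random(seed)  # fixed seed so the pairing is reproducible
    groups = defaultdict(list)
    for sample_id in sample_ids:
        # Map each field letter to its three-digit code, e.g. {"C": "001", ...}.
        fields = dict(re.findall(r"([SCPRA])(\d{3})", sample_id))
        groups[(fields["C"], fields["A"])].append((fields["P"], sample_id))

    pairs = []
    for members in groups.values():
        for person, sample_id in members:
            candidates = [other_id for other_person, other_id in members
                          if other_person != person]
            if candidates:
                pairs.append((sample_id, rng.choice(candidates)))
    return pairs
```

Drawing partners from within a single camera-action group keeps the view and action fixed, so the two samples in a pair differ mainly in who performs the motion. Whether pairs should be drawn once and saved, or resampled every epoch, is left open here.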


- [1] Carr, T., Xu, D., & Lu, A. (2024). Adversary-guided motion retargeting for skeleton anonymization. arXiv. https://arxiv.org/abs/2405.05428
- [2] Shahroudy, A., Liu, J., Ng, T.-T., & Wang, G. (2016). NTU RGB+D: A large scale dataset for 3D human activity analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1010–1019).