This is the official implementation of the paper 'Multi-Semantic Fusion Model for Generalized Zero-Shot Skeleton-based Action Recognition' (ICIG 2023).
- Python >= 3.8.13
- Torch >= 1.12.1
- Scikit-Learn
To run the code, first download the skeleton features. The skeleton features can be downloaded here. After downloading, rename the `synse_resources/ntu_results` folder to `sk_feats` and place it in the root directory of this repository.
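The rename step above can be sketched as the following shell commands, assuming the download unpacks to `synse_resources/ntu_results` in the current directory (the `mkdir` line is only a stand-in for the downloaded folder):

```shell
# Stand-in for the folder produced by the download; in practice this
# directory comes from extracting the downloaded skeleton features.
mkdir -p synse_resources/ntu_results

# Rename it to sk_feats in the repository root, as the code expects.
mv synse_resources/ntu_results sk_feats
```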
- Run `bash run60.sh` for training & testing on NTU-60.
- Run `bash run120.sh` for training & testing on NTU-120.
- Seen-Unseen Splits for GZSSAR: The seen-unseen splits for NTU-60 & NTU-120 (in `label_splits`) are the same as those of SynSE; see here for details.
- Skeleton Features: The skeleton features (in `sk_feats`) are extracted through ShiftGCN.
- Text Features: Three different types of semantic information (i.e., class labels, action descriptions, and motion descriptions) are provided in `sem_info`. The text features of the semantic information, provided in `text_feats`, are extracted through the pre-trained ViT-B/32 model, the text encoder of CLIP.
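To illustrate how such features are typically used in zero-shot recognition, here is a minimal, self-contained sketch: a skeleton feature is assigned to the class whose text feature is closest under cosine similarity. The feature vectors below are toy 3-d stand-ins, not the actual ShiftGCN/CLIP features shipped with this repo, and the matching rule is a generic illustration rather than the paper's fusion model:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy stand-ins for CLIP text features of two class labels.
text_feats = {
    "drink water": [0.9, 0.1, 0.0],
    "jump up":     [0.0, 0.8, 0.6],
}

# Toy stand-in for one ShiftGCN skeleton feature.
skeleton_feat = [0.85, 0.2, 0.1]

# Predict the class whose text feature is most similar.
pred = max(text_feats, key=lambda c: cosine(skeleton_feat, text_feats[c]))
print(pred)  # → drink water
```

In the actual pipeline the two feature spaces are aligned by a learned model rather than compared directly, but the nearest-text-feature idea is the core of zero-shot inference.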
@inproceedings{Li2023MSF,
title={Multi-semantic fusion model for generalized zero-shot skeleton-based action recognition},
author={Li, Ming-Zhe and Jia, Zhen and Zhang, Zhang and Ma, Zhanyu and Wang, Liang},
booktitle={International Conference on Image and Graphics},
pages={68--80},
year={2023},
organization={Springer}
}