PyTorch implementation of "SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition"

SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition

Hongda Liu1,2, Yunfan Liu2*, Changlu Wang1,2, Yunlong Wang1, and Zhenan Sun1*

1 NLPR, Institute of Automation, Chinese Academy of Sciences
2 University of Chinese Academy of Sciences

This repository is the official PyTorch implementation of the paper "SkeletonAgent: An Agentic Interaction Framework for Skeleton-based Action Recognition".

Abstract

Recent advances in skeleton-based action recognition increasingly leverage semantic priors from Large Language Models (LLMs) to enrich skeletal representations. However, the LLM is typically queried in isolation from the recognition model and receives no performance feedback. As a result, it often fails to deliver the targeted discriminative cues critical to distinguish similar actions. To overcome these limitations, we propose SkeletonAgent, a novel framework that bridges the recognition model and the LLM through two cooperative agents, i.e., Questioner and Selector. Specifically, the Questioner identifies the most frequently confused classes and supplies them to the LLM as context for more targeted guidance. Conversely, the Selector parses the LLM’s response to extract precise joint-level constraints and feeds them back to the recognizer, enabling finer-grained cross-modal alignment. Comprehensive evaluations on five benchmarks, including NTU RGB+D, NTU RGB+D 120, Kinetics-Skeleton, FineGYM, and UAV-Human, demonstrate that SkeletonAgent consistently outperforms state-of-the-art benchmark methods.
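The Questioner's first step, finding the classes the recognizer confuses most often, can be illustrated with a small sketch. This is not the repository's implementation: the function name `most_confused_pairs`, the label strings, and the choice to count raw misclassifications are all assumptions made for illustration.

```python
from collections import Counter


def most_confused_pairs(y_true, y_pred, top_n=3):
    """Return (true_label, predicted_label, count) for the most frequent errors.

    Hypothetical helper: counts only misclassified samples and ranks the
    (true, predicted) pairs by how often they occur.
    """
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return [(t, p, n) for (t, p), n in errors.most_common(top_n)]


# Made-up ground truth and predictions, for illustration only.
y_true = ["reading", "reading", "writing", "writing", "writing", "clapping"]
y_pred = ["writing", "writing", "writing", "reading", "writing", "clapping"]

pairs = most_confused_pairs(y_true, y_pred, top_n=2)
for true_label, pred_label, count in pairs:
    print(f"{true_label} -> {pred_label}: {count} errors")
```

In a setup like the one the abstract describes, pairs such as these would be passed to the LLM as context so that its answer focuses on what separates the confused classes.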

🎨 Installation

git clone https://github.com/firework8/SkeletonAgent.git
cd SkeletonAgent
conda env create -f skeletonagent.yaml
conda activate skeletonagent
pip install -r requirements.txt
pip install -e .

📝 Data Preparation

PYSKL provides links to the pre-processed skeleton pickle annotations.

For Kinetics-Skeleton, the skeleton annotations are large, so please download the kpfiles from the Kinetics Annotation Link and extract them under $SkeletonAgent/data/k400. The kpfiles must be extracted under Linux. Kinetics-Skeleton also depends on Memcached; see the instructions here.

See the official Data Doc of PYSKL for more detailed instructions.
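After downloading an annotation file, a quick sanity check can confirm that it loaded correctly. The sketch below assumes the layout described in the PYSKL Data Doc: a pickled dict with a `split` field (split name to list of sample identifiers) and an `annotations` field (a list of dicts, each with a `label`). The helper name `summarize_annotations` and the synthetic demo file are hypothetical; check the Data Doc if your file differs.

```python
import os
import pickle
import tempfile
from collections import Counter


def summarize_annotations(path):
    """Return (samples per split, samples per label) for a pickle annotation file.

    Assumes a dict with 'split' and 'annotations' keys, as described above.
    """
    with open(path, "rb") as f:
        data = pickle.load(f)
    split_sizes = {name: len(ids) for name, ids in data["split"].items()}
    label_counts = dict(Counter(a["label"] for a in data["annotations"]))
    return split_sizes, label_counts


# Demo on a tiny synthetic file so the sketch runs without any dataset.
demo = {
    "split": {"train": ["s1", "s2"], "val": ["s3"]},
    "annotations": [
        {"frame_dir": "s1", "label": 0},
        {"frame_dir": "s2", "label": 1},
        {"frame_dir": "s3", "label": 0},
    ],
}
demo_path = os.path.join(tempfile.mkdtemp(), "demo_anno.pkl")
with open(demo_path, "wb") as f:
    pickle.dump(demo, f)

split_sizes, label_counts = summarize_annotations(demo_path)
print(split_sizes, label_counts)
```

To inspect a real file, replace `demo_path` with the path to the downloaded pickle. Only unpickle files from sources you trust.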

Note that the API information must be configured in the files. Alternatively, you can train with the semi-version.

💫 Training & Testing

Modify the config file to suit your needs, then use the following commands for training and testing. Distributed training is supported on a single server with multiple GPUs.

# Training
bash tools/dist_train.sh {config_name} {num_gpus} {other_options}
# For example: train on NTU RGB+D X-Sub (Joint Modality) with 1 GPU, with validation, and test the checkpoint.
bash tools/dist_train.sh configs/ntu60_xsub/j.py 1 --validate --test-last --test-best
# Testing
bash tools/dist_test.sh {config_name} {checkpoint_file} {num_gpus} {other_options}
# For example: test on NTU RGB+D X-Sub (Joint Modality) with metrics `top_k_accuracy`, and dump the result to `result.pkl`.
bash tools/dist_test.sh configs/ntu60_xsub/j.py checkpoints/CHECKPOINT.pth 1 --eval top_k_accuracy --out result.pkl
# Ensemble the results
cd tools
python ensemble.py
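Ensembling in skeleton-based recognition is commonly done by score-level fusion: the per-class scores of models trained on different modalities (for example joint and bone) are combined with a weighted sum before taking the argmax. The sketch below illustrates that idea only. It is not the content of tools/ensemble.py, and the function names, weights, and toy scores are assumptions. It also assumes each dumped result holds one list of class scores per sample, in the same sample order.

```python
def fuse_scores(score_lists, weights):
    """Weighted sum of per-sample class scores from several models."""
    num_samples = len(score_lists[0])
    fused = []
    for i in range(num_samples):
        num_classes = len(score_lists[0][i])
        fused.append([
            sum(w * scores[i][c] for scores, w in zip(score_lists, weights))
            for c in range(num_classes)
        ])
    return fused


def top1_accuracy(scores, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    correct = sum(
        1 for row, label in zip(scores, labels)
        if max(range(len(row)), key=row.__getitem__) == label
    )
    return correct / len(labels)


# Toy scores for three samples and two classes (made up for illustration).
joint_scores = [[0.6, 0.4], [0.4, 0.6], [0.2, 0.8]]
bone_scores = [[0.45, 0.55], [0.9, 0.1], [0.3, 0.7]]
labels = [0, 0, 1]

fused = fuse_scores([joint_scores, bone_scores], weights=[1.0, 1.0])
print("joint:", top1_accuracy(joint_scores, labels))
print("bone:", top1_accuracy(bone_scores, labels))
print("fused:", top1_accuracy(fused, labels))
```

In this toy case each single model misclassifies a different sample, so the fused scores classify all three correctly. With real results the fusion weights are usually tuned on a validation set.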

📁 Pretrained Models

All the checkpoints can be downloaded from here.

For the detailed performance of pretrained models, please go to the Model Doc.

🍻 Acknowledgements

This repo is mainly based on PYSKL. We also refer to CTR-GCN, LA-GCN, and GAP.

Thanks to the original authors for their excellent work!

📧 Contact

For any questions, feel free to contact: hongda.liu@cripac.ia.ac.cn
