Co-Speech Gesture Detection through Multi-Phase Sequence Labeling

This repository contains the code of the WACV2024 paper titled "Co-Speech Gesture Detection through Multi-Phase Sequence Labeling." by Esam Ghaleb, Ilya Burenko, Marlou Rasenberg, Wim Pouw, Peter Uhrig, Judith Holler, Ivan Toni, Aslı Özyürek, Raquel Fernández.

Generate data

Generate data running the following command:

python co_speech_gesture_detection/data/generate_data.py

This script will generate data for 5-fold cross-validation and store corresponding data to

co_speech_gesture_detection/data/data/{0,1,2,3,4}/

Download data and pretrained models

Download pretrained weights for ST_GCN from Google Drive, unzip and put them to

co_speech_gesture_classification/

Download data from this folder, unzip and put it to

co_speech_gesture_classification/data/videos/

After this step the folder should have the following structure:

.
├── README.md
├── main_sequential.py
└── co_speech_gesture_detection/
    ├── __init__.py
    ├── sequential_parser.py
    ├── 27_2_finetuned
    ├── config
    ├── data/
    │   ├── data/
    │   │   ├── 0
    │   │   ├── ...
    │   │   └── 4
    │   ├── full_data/
    │   │   └── gestures_info_mmpose.pkl
    │   └── videos/
    │       └── npy3/
    │           └── *.npy
    ├── feeders
    ├── graph
    ├── loss
    ├── model
    ├── processor
    └── utils

Run training

Run training procedure using the follow command:

python main_sequential.py

Reference

If you make use of the code or any materials in this repository, please cite the following paper:

@inproceedings{ghaleb2023cospeech,
  title={Co-Speech Gesture Detection through Multi-Phase Sequence Labeling},
  author={Ghaleb, Esam and Burenko, Ilya and Rasenberg, Marlou and Pouw, Wim and Uhrig, Peter and Holler, Judith and Toni, Ivan and \"{O}zy\"{u}rek, Aslı and Fernández, Raquel},
  booktitle={Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision},
  pages={}, % TBC
  year={2024},
  address={WAIKOLOA, HAWAII}, % 
  publisher={IEEE/CVF}, 
  doi={}, % TBC
}

Acknowlegements

This work was funded by the NWO through a gravitation grant (024.001.006) to the LiI Consortium. Further funding was provided by the DFG (project number 468466485) and the Arts and Humanities Research Council (grant reference AH/W010720/1) to Peter Uhrig and Anna Wilson (University of Oxford). Raquel Fernández is supported by the European Research Council (ERC CoG grant agreement 819455). We thank the Dialogue Modelling Group members at UvA, especially Alberto Testoni and Ece Takmaz, for their valuable feedback. We extend our gratitude to Kristel de Laat for contributing to the segmentation of co-speech gestures. The authors gratefully acknowledge the scientific support and HPC resources provided by the Erlangen National High-Performance Computing Center (NHR@FAU) of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) under the NHR project b105dc to Peter Uhrig. NHR funding is provided by federal and Bavarian state authorities. NHR@FAU hardware is partially funded by the German Research Foundation (DFG) – 440719683.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Co-Speech Gesture Detection through Multi-Phase Sequence Labeling

Generate data

Download data and pretrained models

Run training

Reference

Acknowlegements

About

Releases

Packages

Contributors 2

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
co_speech_gesture_detection		co_speech_gesture_detection
paper		paper
.gitignore		.gitignore
README.md		README.md
main_sequential.py		main_sequential.py

EsamGhaleb/Multi-Phase-Gesture-Detection

Folders and files

Latest commit

History

Repository files navigation

Co-Speech Gesture Detection through Multi-Phase Sequence Labeling

Generate data

Download data and pretrained models

Run training

Reference

Acknowlegements

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages