Code Author: Shih-Yao (Mike) Lin
The CITI-DailyActivities 3D dataset comprises action videos in three modalities: RGB videos, depth maps, and 3D skeleton structures. It covers fifteen daily activities: walk, sit down, sit still, use a TV remote, stand up, stand still, pick up books, carry books, put down books, carry a backpack, drop a backpack, make a phone call, drink water, wave hand, and clap.
The dataset has 481 sequences. Among them, 182 sequences contain outlier frames that appear at arbitrary locations and last for various durations. Ten actors (eight males and two females) were recruited to build this dataset; one of them is left-handed. Each actor performed each activity between two and five times. A Microsoft Kinect was used for collection, so the RGB video, the depth maps, and the inferred skeletons of each activity sequence are all available. The skeleton structures in this work were extracted with the Kinect SDK v1.8.
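For reference, the fifteen activity labels above can be collected into a list. Note that the numeric class IDs below simply follow listing order and are illustrative only; consult the dataset's label files for the authoritative mapping.

```python
# The fifteen daily activities, indexed 0-14 in listing order.
# These indices are an assumption for illustration, not the
# dataset's official class IDs.
ACTIVITIES = [
    "walk", "sit down", "sit still", "use a TV remote", "stand up",
    "stand still", "pick up books", "carry books", "put down books",
    "carry a backpack", "drop a backpack", "make a phone call",
    "drink water", "wave hand", "clap",
]
ID_TO_NAME = {i: name for i, name in enumerate(ACTIVITIES)}
print(len(ACTIVITIES))  # 15 activity classes
```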
- RGB Images (21GB) (480x640)
- Depth Images (2.1GB) (320x240)
- 3D Skeletal Data (256MB)
- Labels
NOTE: The dataset contains 481 action examples; examples #1-#300 are actions without outlier frames, and examples #301-#482 are actions with outlier frames.
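Given the numbering convention above, one might partition example IDs into clean and outlier subsets as in the minimal sketch below. This helper is not part of the released loader; the ID bounds follow the NOTE (the upper bound of the outlier range is #482).

```python
# Partition 1-based example IDs into clean vs. outlier subsets,
# following the NOTE: #1-#300 are clean, #301-#482 contain outliers.
# Hypothetical helper for illustration, not part of citi_loader.py.

def split_example_ids(first_outlier_id=301, last_id=482):
    """Return (clean_ids, outlier_ids) as lists of 1-based example IDs."""
    clean_ids = list(range(1, first_outlier_id))
    outlier_ids = list(range(first_outlier_id, last_id + 1))
    return clean_ids, outlier_ids

clean, outliers = split_example_ids()
print(len(clean), len(outliers))  # 300 182
```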
To load the dataset, run:

    python citi_loader.py
If you use this dataset, please cite:

    @article{lin2017recognizing,
      title={Recognizing human actions with outlier frames by observation filtering and completion},
      author={Lin, Shih-Yao and Lin, Yen-Yu and Chen, Chu-Song and Hung, Yi-Ping},
      journal={ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)},
      volume={13},
      number={3},
      pages={28},
      year={2017},
      publisher={ACM}
    }