This project uses Matterport's implementation of Mask RCNN to retrieve bounding boxes for detected humans in the Toyota Smarthome dataset, as described in our paper (Climent-Pérez et al. 2021, https://doi.org/10.3390/s21031005).
These calculated bounding boxes are then used in the DAIGroup/i3d project to extract crops of the images around the detections.
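As a rough illustration of the two steps above, the following sketch keeps only the person detections from a Matterport-style result dictionary (with `rois` as `(y1, x1, y2, x2)` boxes, `class_ids`, and `scores`) and computes a padded crop region around each box. It assumes the COCO class list used in Matterport's demo, where index 1 is "person". The function names, the `min_score` threshold, and the `margin` value are hypothetical and are not taken from this project or from DAIGroup/i3d.

```python
# Assumption: index of "person" in the COCO class list used by Matterport's demo.
PERSON_CLASS_ID = 1


def person_boxes(result, min_score=0.5):
    """Return the (y1, x1, y2, x2) boxes of detections labelled as person.

    `result` mimics one entry of the list returned by Matterport's
    `model.detect`: a dict with "rois", "class_ids" and "scores".
    `min_score` is a hypothetical confidence threshold.
    """
    boxes = []
    for roi, class_id, score in zip(
        result["rois"], result["class_ids"], result["scores"]
    ):
        if int(class_id) == PERSON_CLASS_ID and float(score) >= min_score:
            y1, x1, y2, x2 = (int(v) for v in roi)
            boxes.append((y1, x1, y2, x2))
    return boxes


def padded_crop(box, image_height, image_width, margin=0.1):
    """Enlarge a box by `margin` (fraction of its size) and clamp it to the image."""
    y1, x1, y2, x2 = box
    pad_y = int(round((y2 - y1) * margin))
    pad_x = int(round((x2 - x1) * margin))
    return (
        max(0, y1 - pad_y),
        max(0, x1 - pad_x),
        min(image_height, y2 + pad_y),
        min(image_width, x2 + pad_x),
    )
```

With an image loaded as a NumPy array, the returned tuple could be used as `image[y1:y2, x1:x2]` to obtain the crop.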
You have two options: clone this project and run it (you will need a copy of the dataset), or download the detections that this network produced.
You can download them from this Google Drive link: [download].
There are two directories within the downloaded .tgz file:
- mp4_mrcnn_bbox contains the bounding boxes calculated with Mask RCNN as is, that is, the raw version.
- mp4_mrcnn_bbox_nogaps contains versions of some videos that had gaps in detection of fewer than 60 frames, which have been filled in with the preprocessing scripts found in the companion DAIGroup/i3d project (here).
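To make the idea of filling short detection gaps concrete, here is a minimal sketch that fills runs of missing boxes shorter than 60 frames by linear interpolation between the neighbouring detections. This is only an assumption for illustration: the actual method used by the preprocessing scripts in DAIGroup/i3d is not described here and may differ. The per-frame list format (one box or `None` per frame) and the function name are hypothetical.

```python
MAX_GAP = 60  # gaps with fewer frames than this are filled


def fill_short_gaps(boxes, max_gap=MAX_GAP):
    """Fill runs of None shorter than `max_gap` by linear interpolation.

    `boxes` holds one (y1, x1, y2, x2) tuple per frame, or None where no
    person was detected. Gaps at the very start or end of the video are
    left untouched because they lack a detection on one side.
    """
    filled = list(boxes)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # index of the first frame after the gap
        gap_len = end - start
        if start == 0 or end == n or gap_len >= max_gap:
            continue
        before, after = filled[start - 1], filled[end]
        for k in range(start, end):
            t = (k - start + 1) / (gap_len + 1)
            filled[k] = tuple(
                round(b + t * (a - b)) for b, a in zip(before, after)
            )
    return filled
```

Longer gaps are deliberately left as `None`, since interpolating across many frames would likely produce boxes that no longer follow the person.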
The preprocessing scripts in the DAIGroup/i3d project will take the best available file: if the detection file is only present in the mp4_mrcnn_bbox directory, they will take that, but if a corrected version is available in the mp4_mrcnn_bbox_nogaps directory, they will take that instead.
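The selection rule described above can be sketched as follows. This is not the code from DAIGroup/i3d, only a minimal illustration that prefers the corrected file when it exists. The function name and the way detection files are named are hypothetical. Only the two directory names come from the archive layout.

```python
from pathlib import Path


def best_detection_file(root, relative_name):
    """Return the corrected detection file if present, else the raw one.

    `root` is the directory where the .tgz was extracted and
    `relative_name` is the (hypothetical) file name of one video's
    detections. Returns None when neither directory has the file.
    """
    root = Path(root)
    corrected = root / "mp4_mrcnn_bbox_nogaps" / relative_name
    raw = root / "mp4_mrcnn_bbox" / relative_name
    if corrected.is_file():
        return corrected
    if raw.is_file():
        return raw
    return None
```

Checking the corrected directory first means that a caller never needs to know which videos had gaps.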
NOTE: If you use these detections in your research, please cite (Climent-Pérez et al. 2021) below.
- (Das et al. 2019) Das, S., Dai, R., Koperski, M., Minciullo, L., Garattoni, L., Bremond, F., & Francesca, G. (2019). Toyota smarthome: Real-world activities of daily living. In Proceedings of the IEEE International Conference on Computer Vision (pp. 833-842).
- (Climent-Pérez et al. 2021) Climent-Pérez, P., & Florez-Revuelta, F. (2021). Improved action recognition with Separable spatio-temporal attention using alternative Skeletal and Video pre-processing. Sensors, 21(3), 1005. DOI: https://doi.org/10.3390/s21031005.
Copyright (c) 2017 Matterport, Inc.
Licensed under the MIT License (see LICENSE for details).
Written by Waleed Abdulla.