Jihoon Chung, Yu Wu, Olga Russakovsky
Princeton University
This is the official implementation of the HAT toolkit, built on SeMask and E2FGVI. Although we provide a demo of the toolkit using MMAction2, it can easily be adapted to other human action recognition models.
The toolkit makes use of three modified datasets generated from the original video dataset.
- Background Only Videos: the human segmentation is removed from each video frame and inpainted to make the person 'invisible'.
- Human Only Videos: only the human region is kept intact; the remaining pixels are replaced with the dataset's average color.
- Action Swap Videos: the human region is composited onto a background taken from a different video.
For efficiency, we only keep the original frames, human segmentations, and Background Only Videos on disk. We advise generating Human Only Videos and Action Swap Videos online within the dataloader, as sketched below.
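As a rough sketch of this online generation (the helper names `human_only` and `action_swap`, and the placeholder mean color, are illustrative assumptions rather than part of the toolkit's API), the two variants can be composited from an original frame, its human segmentation mask, and an inpainted Background Only frame:

```python
import numpy as np

def human_only(frame: np.ndarray, mask: np.ndarray,
               mean_color=(114, 108, 100)) -> np.ndarray:
    """Keep the human region; fill the rest with the dataset average color.

    frame: HxWx3 uint8 original frame.
    mask:  HxW boolean array, True where a human is present.
    mean_color: placeholder value; substitute your dataset's average color.
    """
    out = np.empty_like(frame)
    out[:] = np.asarray(mean_color, dtype=frame.dtype)
    out[mask] = frame[mask]
    return out

def action_swap(frame: np.ndarray, mask: np.ndarray,
                other_background: np.ndarray) -> np.ndarray:
    """Paste the human region onto a Background Only frame from another video.

    other_background: HxWx3 inpainted frame from a different video,
    resized to match `frame` beforehand.
    """
    out = other_background.copy()
    out[mask] = frame[mask]
    return out
```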
Copy (or softlink) your video dataset into the `data/{dataset_name}/ori` folder. We have included two example videos from Kinetics-400; these are sufficient to demo the toolkit.
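For example, assuming your dataset lives at a hypothetical local path, softlinking it into the expected layout could look like:

```python
import os
from pathlib import Path

src = Path("/data/kinetics400/videos")  # hypothetical location of your downloaded dataset
dst = Path("data/kinetics400/ori")      # layout expected by the toolkit

dst.mkdir(parents=True, exist_ok=True)
for video in src.glob("*.mp4"):
    link = dst / video.name
    if not link.exists():
        os.symlink(video.resolve(), link)  # softlink to avoid duplicating storage
```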
Please check the instructions for generating human segmentations with SeMask.
Please check the instructions for generating Background Only Videos (inpainting) with E2FGVI.
In our paper, we use MMAction2 as the basis for our human action recognizer. You can implement your own dataloader if your pipeline does not use MMAction2.
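If your pipeline does not use MMAction2, a minimal PyTorch `Dataset` along the following lines can generate the perturbed variants online (the directory names `mask` and `bg`, the per-frame file names, and the class itself are assumptions for illustration; it reuses the `human_only` / `action_swap` helpers sketched above):

```python
import random
from pathlib import Path

import numpy as np
from PIL import Image
from torch.utils.data import Dataset

class HATFrameDataset(Dataset):
    """Illustrative dataset that yields one perturbed frame per video.

    Assumed layout: data/{dataset}/ori, data/{dataset}/mask, data/{dataset}/bg
    each contain one folder of frames per video.
    """

    def __init__(self, root: str, variant: str = "human_only"):
        self.root = Path(root)
        self.variant = variant  # "human_only" or "action_swap"
        self.videos = sorted(p.name for p in (self.root / "ori").iterdir())

    def __len__(self):
        return len(self.videos)

    def __getitem__(self, idx):
        name = self.videos[idx]
        # Frame file names are placeholders; adapt to your extraction scheme.
        frame = np.array(Image.open(self.root / "ori" / name / "000001.jpg"))
        mask = np.array(Image.open(self.root / "mask" / name / "000001.png")) > 0
        if self.variant == "human_only":
            return human_only(frame, mask)  # helper from the sketch above
        # Action Swap: borrow the inpainted background of a random other video.
        other = random.choice([v for v in self.videos if v != name])
        bg = np.array(Image.open(self.root / "bg" / other / "000001.jpg"))
        return action_swap(frame, mask, bg)  # helper from the sketch above
```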
Please check the instructions for the MMAction2 demo.
We offer pre-generated files (human segmentations, inpainted frames, and original frames) for Kinetics-400 and UCF101.
We are grateful for the support from the National Science Foundation under Grant No. 2112562, Microsoft, Princeton SEAS Project X Innovation Fund, and Princeton First Year Ph.D. Fellowship to JC.