A practice program for real-time action recognition with C3D.
The program depends on PyTorch, OpenCV, PyQt, and a few other packages.
- Install the dependencies. For PyTorch, see pytorch.org for installation details.
For the other dependencies:
conda install opencv
pip install tqdm scikit-learn tensorboardX
pip install pyqt5
- Download the pretrained model from BaiduYun or GoogleDrive.
- Configure the paths to your dataset and pretrained model.
- Train the model.
python train.py
- Run the GUI to test on a camera.
python GUI.py
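Real-time recognition on a camera feed generally means classifying a sliding window of the most recent frames. As a rough illustration only (this is not the code in GUI.py), the sketch below keeps recent frames in a fixed-length buffer and returns a clip once enough frames have arrived. The `ClipBuffer` name is hypothetical, and the default of 16 frames per clip follows the C3D paper's setting, which may differ from what this program uses.

```python
from collections import deque


class ClipBuffer:
    """Hold the most recent `clip_len` frames of a video stream."""

    def __init__(self, clip_len=16):
        # Assumption: 16 frames per clip, as in the C3D paper.
        self.clip_len = clip_len
        # A bounded deque drops the oldest frame automatically when full.
        self._frames = deque(maxlen=clip_len)

    def push(self, frame):
        """Add a frame; return the current clip as a list once full, else None."""
        self._frames.append(frame)
        if len(self._frames) < self.clip_len:
            return None
        return list(self._frames)


if __name__ == "__main__":
    buffer = ClipBuffer(clip_len=4)
    for frame_id in range(6):  # integers stand in for image arrays
        print(frame_id, buffer.push(frame_id))
```

In a camera loop, each frame read with OpenCV's `cv2.VideoCapture.read()` would be preprocessed and pushed into the buffer; whenever `push` returns a clip, that clip would be stacked into a tensor and passed to the model.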
The dataset directory tree is shown below.
- UCF101
Make sure the files are arranged in the following structure:
UCF-101
├── ApplyEyeMakeup
│ ├── v_ApplyEyeMakeup_g01_c01.avi
│ └── ...
├── ApplyLipstick
│ ├── v_ApplyLipstick_g01_c01.avi
│ └── ...
└── Archery
  ├── v_Archery_g01_c01.avi
  └── ...
After pre-processing, the output dir's structure is as follows:
ucf101
├── ApplyEyeMakeup
│ ├── v_ApplyEyeMakeup_g01_c01
│ │ ├── 00001.jpg
│ │ └── ...
│ └── ...
├── ApplyLipstick
│ ├── v_ApplyLipstick_g01_c01
│ │ ├── 00001.jpg
│ │ └── ...
│ └── ...
└── Archery
  ├── v_Archery_g01_c01
  │ ├── 00001.jpg
  │ └── ...
  └── ...
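The mapping from a source video to its frame directory can be sketched as a small path helper. This is an illustration under assumptions, not the repository's pre-processing code: `frame_path` is a hypothetical name, and the five-digit, 1-based numbering is inferred only from the `00001.jpg` entry in the tree above.

```python
from pathlib import Path


def frame_path(output_root, video_path, frame_index):
    """Map a source video and a frame index to its output image path."""
    video_path = Path(video_path)
    class_name = video_path.parent.name  # e.g. "ApplyEyeMakeup"
    video_name = video_path.stem         # file name without ".avi"
    # Assumption: frames are numbered with five digits, as in 00001.jpg.
    return Path(output_root) / class_name / video_name / f"{frame_index:05d}.jpg"


if __name__ == "__main__":
    src = "UCF-101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01.avi"
    print(frame_path("ucf101", src, 1).as_posix())
    # prints: ucf101/ApplyEyeMakeup/v_ApplyEyeMakeup_g01_c01/00001.jpg
```

Each video gets its own directory named after the file, nested under its class directory, so class labels can be recovered from the folder names alone.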
These models were trained on a machine with an NVIDIA TITAN X GPU (12 GB). Note that I split the train/val/test data for each dataset using sklearn. If you want to train models on the official train/val/test splits, look in dataset.py and modify it to suit your needs.
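A split of this kind can be sketched with scikit-learn's `train_test_split`, applied twice: once to hold out the test set and once to carve a validation set out of the remainder. The function name `split_videos`, the 0.2 ratios, and the seed are assumptions for illustration; the values actually used are in dataset.py.

```python
from sklearn.model_selection import train_test_split


def split_videos(video_names, test_size=0.2, val_size=0.2, seed=42):
    """Split a list of video names into train, val, and test lists."""
    # Hold out the test set first.
    train_val, test = train_test_split(
        video_names, test_size=test_size, random_state=seed
    )
    # Then take the validation set from what remains.
    train, val = train_test_split(
        train_val, test_size=val_size, random_state=seed
    )
    return train, val, test


if __name__ == "__main__":
    # Made-up video names, used only to exercise the function.
    videos = [f"v_Archery_g{i:02d}_c01" for i in range(1, 11)]
    train, val, test = split_videos(videos)
    print(len(train), len(val), len(test))
```

Fixing `random_state` keeps the split reproducible across runs, which matters when comparing models trained at different times.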
- UCF101
The paper: "Learning Spatiotemporal Features with 3D Convolutional Networks" by Du Tran, Lubomir Bourdev, Rob Fergus, Lorenzo Torresani, and Manohar Paluri.