- The main script is ucf_pt.py
- This script uses pretrained i3d weights converted from TF to PyTorch (courtesy of Yana Hasson)
- Logdir naming convention:
logs/MODALITY/WTS_LEARNING_RATE_EPOCHS
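
The naming convention can be sketched as a small helper. This is a minimal illustration, not code from ucf_pt.py: the function name `make_logdir`, the underscore separators, and the way `--nstr` is appended to the default name are all assumptions.

```python
import os


def make_logdir(modality, wts, lr, epochs, nstr=None, root="logs"):
    """Build a log directory path of the form logs/MODALITY/WTS_LR_EPOCHS.

    Hypothetical helper: the separators and the handling of nstr are
    assumptions, not taken from ucf_pt.py.
    """
    name = "{}_{}_{}".format(wts, lr, epochs)
    if nstr:
        # Assumed: the optional name string is appended to the default name.
        name = "{}_{}".format(name, nstr)
    return os.path.join(root, modality, name)


print(make_logdir("flyflow", "flow", 0.001, 40))
```
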
- pytorch=0.3
- tensorboardX
- tqdm
- torchvision
Option | Default | Meaning |
---|---|---|
--trainlist | ../list/flowtrainlist01.txt | trainlist |
--testlist | ../list/flowtestlist01.txt | testlist |
--nstr | None | Name string (added to the default name) |
--modality | rgb | rgb / flow / rgbdsc / flowdsc / flyflow |
--modality2 | None | Second stream mode: rgb / flow / rgbdsc / flowdsc / flyflow / edr1 |
--wts | rgb | (rgb/flow) which weights to load |
--mean | False | Use mean and unsqueezing or linear transformation |
--random | False | Give the first layer random weights |
--dog | False | Use Difference of Gaussians (not trainable) as the first filter |
--rdirs | [0,1,2,3,4,5,6,7] | Reichardt directions to extract |
--epochs | 40 | Number of training epochs |
--lr | 0.001 | Starting learning rate |
--batch | 6 | Training batch size |
--testbatch | 6 | Test batch size |
--sched | False | Use a scheduler or not |
--lr_steps | [2,6,10,13] | scheduler steps |
--thres | None | Threshold the input to the network to get sparsity |
--gpus | 1 | Number of GPUs |
--numw | 8 | Number of workers (presumably for data loading) |
--resume | None | Resume training (specify the file) or not |
--ft | True | Finetune or not |
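
For orientation, a subset of the options above could be declared with `argparse` roughly as follows. This is a sketch under assumptions, not the parser in ucf_pt.py: `build_parser` and `str2bool` are hypothetical names, and the boolean parsing is only one way to accept values such as `--ft True`.

```python
import argparse


def str2bool(value):
    """Parse 'True'/'False' style strings, as in `--ft True` (assumed behavior)."""
    if isinstance(value, bool):
        return value
    if value.lower() in ("true", "1", "yes"):
        return True
    if value.lower() in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Declare a subset of the documented options with their documented defaults."""
    parser = argparse.ArgumentParser(description="Sketch of a subset of the options")
    parser.add_argument("--modality", default="rgb",
                        choices=["rgb", "flow", "rgbdsc", "flowdsc", "flyflow"])
    parser.add_argument("--wts", default="rgb", choices=["rgb", "flow"])
    parser.add_argument("--rdirs", type=int, nargs="+", default=list(range(8)))
    parser.add_argument("--epochs", type=int, default=40)
    parser.add_argument("--lr", type=float, default=0.001)
    parser.add_argument("--batch", type=int, default=6)
    parser.add_argument("--sched", type=str2bool, default=False)
    parser.add_argument("--lr_steps", type=int, nargs="+", default=[2, 6, 10, 13])
    parser.add_argument("--gpus", type=int, default=1)
    parser.add_argument("--ft", type=str2bool, default=True)
    return parser


args = build_parser().parse_args(
    ["--modality", "flyflow", "--wts", "flow", "--gpus", "2", "--ft", "True"])
print(args.modality, args.wts, args.gpus, args.ft, args.rdirs)
```
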
- rgb: Simply use the rgb data in /mnt/data1/UCF-101
- rgbdsc: Use the rgb data in /mnt/data1/UCF-101, convert to grayscale, and then send to Reichardt DS8
- flow: Simply use the flow data in /mnt/data1/UCF-101_old
- flowdsc: Use the flow data in /mnt/data1/UCF-101_old, but transform the data from 2 channels to 8 using the transformation matrix. [The output is not currently thresholded]
- flyflow:
  - Use only the 2 channel Reichardt output for the rgb data in /mnt/data1/UCF-101
  - With flyflow and more than 2 rdirs specified, mean becomes active by default to ensure model integrity. That is, the weights of the first convolution layer are averaged and then finetuned.
  - rdirs convention:
    - 0,1: vertical(1), vertical(-1)
    - 2,3: diagonal1(2), diagonal1(-2)
    - 4,5: horizontal(3), horizontal(-3)
    - 6,7: diagonal2(4), diagonal2(-4)
  - This is just a variant of rgbdsc with the option of choosing directions
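
The rdirs convention and the rule about mean can be written down as a lookup table. This is an illustrative sketch only: `RDIRS`, `describe_rdirs`, and `needs_mean` are hypothetical names, and the table simply transcribes the index pairs listed above.

```python
# Index -> (axis name, signed direction id), transcribed from the rdirs convention.
RDIRS = {
    0: ("vertical", 1), 1: ("vertical", -1),
    2: ("diagonal1", 2), 3: ("diagonal1", -2),
    4: ("horizontal", 3), 5: ("horizontal", -3),
    6: ("diagonal2", 4), 7: ("diagonal2", -4),
}


def describe_rdirs(rdirs):
    """Translate a list of rdirs indices into (axis, signed id) pairs."""
    return [RDIRS[i] for i in rdirs]


def needs_mean(modality, rdirs):
    """flyflow with more than 2 directions activates mean, per the notes above."""
    return modality == "flyflow" and len(rdirs) > 2


print(describe_rdirs([4, 5]))
print(needs_mean("flyflow", [0, 1, 4, 5]))
```
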
- Training list: flowtrainlist01.txt
- Testing list: flowtestlist01.txt
- The data folder for each video has flow_x and flow_y along with img for all the timesteps in the video.
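
A per-video folder laid out this way could be indexed as sketched below. The helper `group_frames` is hypothetical, and it assumes that file names start with the stream name (for example a hypothetical `flow_x_00001.jpg`); if the data uses subfolders instead, the grouping would need to change.

```python
import os
from collections import defaultdict

PREFIXES = ("flow_x", "flow_y", "img")  # stream names from the description above


def group_frames(video_dir):
    """Group the files of one video folder by stream, sorted by file name.

    Assumes file names start with the stream prefix; files that match no
    prefix are ignored.
    """
    groups = defaultdict(list)
    for fname in sorted(os.listdir(video_dir)):
        for prefix in PREFIXES:
            if fname.startswith(prefix):
                groups[prefix].append(os.path.join(video_dir, fname))
                break
    return dict(groups)
```

Sorting by name keeps the timesteps in order as long as the frame index is zero-padded.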
- Running flyflow modality, with flow weights and 2 GPUs and no scheduler
CUDA_VISIBLE_DEVICES="4,5" python ucf_pt.py --modality flyflow --wts flow --epochs 40 --ft True --lr 0.001 --gpus 2 --trainlist ../list/flowtrainlist01.txt --testlist ../list/flowtestlist01.txt
- Running flowdsc modality, with flow weights and 2 GPUs and no scheduler
CUDA_VISIBLE_DEVICES="4,5" python ucf_pt.py --modality flowdsc --wts flow --epochs 40 --ft True --lr 0.001 --gpus 2 --trainlist ../list/flowtrainlist01.txt --testlist ../list/flowtestlist01.txt
- Running rgb modality with Difference of Gaussians and a random first layer, with rgb weights, two visible GPUs (--gpus left at its default) and no scheduler
CUDA_VISIBLE_DEVICES="2,7" python ucf_pt.py --modality rgb --wts rgb --random True --dog True --nstr frand_nsched_dog
- In the get_set_loader() function in ucf_pt.py, modify modlist to include the new modality name
- In file dataset.py:
  - Change the _load_image() function to include your new modality
  - If required, import the desired modality function, and include it in the get() method.
  - Before the if-elifs, the shape of process_data is ~ [Channels, T, H, W]
- In transforms.py:
  - Modify the Stack function to include the details for the new modality
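
The steps above can be pictured with a toy version of the dispatch. Everything here is hypothetical (`mymod`, `mymod_transform`, and this `get` are stand-ins, not the code in dataset.py); nested lists stand in for a tensor so the sketch runs without PyTorch.

```python
# All names here are hypothetical; they only illustrate the steps listed above.
modlist = ["rgb", "flow", "rgbdsc", "flowdsc", "flyflow"]
modlist.append("mymod")  # step 1: make the new modality name known


def shape_of(data):
    """Return (Channels, T, H, W) for a nested-list stand-in for a tensor."""
    return (len(data), len(data[0]), len(data[0][0]), len(data[0][0][0]))


def mymod_transform(process_data):
    """Toy modality: keep only the first channel of a [Channels, T, H, W] input."""
    return process_data[:1]


def get(process_data, modality):
    """Sketch of the if-elif dispatch inside the dataset's get() method."""
    if modality not in modlist:
        raise ValueError("unknown modality: %s" % modality)
    if modality == "mymod":
        process_data = mymod_transform(process_data)
    # ... branches for the existing modalities would sit here ...
    return process_data


# A 3-channel clip with T=2 frames of size 4x5, filled with zeros.
clip = [[[[0.0] * 5 for _ in range(4)] for _ in range(2)] for _ in range(3)]
print(shape_of(get(clip, "mymod")))
```
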
- Deepmind i3D
- TSN Codebase
- Yana Hasson
- Paper: "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset" by Joao Carreira and Andrew Zisserman.