Training Speed of AVA #80

Closed
hzhang57 opened this issue Jan 8, 2020 · 8 comments
Labels
enhancement New feature or request

Comments

@hzhang57

hzhang57 commented Jan 8, 2020

Hi, thanks for sharing your implementation for AVA.
Would it be possible to add a trained model for the "configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml" settings?

I am trying to fine-tune on the AVA dataset from pre-trained Kinetics weights using SLOWFAST_32x2_R50_SHORT.yaml.

I observed that training one epoch takes around 3.5 hours on four 2080 Ti GPUs (batch size 24), so training 20 epochs will take around 3 days.
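For reference, the estimate above follows directly from the per-epoch time. A minimal sketch of the arithmetic, using only the figures quoted in this comment (the variable names are illustrative):

```python
# Rough wall-clock estimate from the figures quoted in the comment above.
hours_per_epoch = 3.5   # observed on four 2080 Ti GPUs with batch size 24
epochs = 20

total_hours = hours_per_epoch * epochs
total_days = total_hours / 24

print(f"{total_hours:.0f} hours, about {total_days:.1f} days")
# prints "70 hours, about 2.9 days"
```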

@haooooooqi haooooooqi added the enhancement New feature or request label Jan 9, 2020
@hzhang57
Author

It took me around 2.5 days to finish 20 epochs of training, reaching an mAP of 24.8 on AVA v2.2 (SlowFast-50). Please post an update if anyone gets better performance.

@tonysy

tonysy commented Mar 14, 2020

@hzhang57 Hi, could you share your recipe for training SlowFast? The authors don't provide a valid recipe for SlowFast-50. In particular, which pre-trained model works for this setting? Thanks a lot.

@hzhang57
Author

I used configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml. The pretrained weights are:
https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl

python tools/run_net.py \
  --cfg configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml \
  DATA.PATH_TO_DATA_DIR path_AVA_DATA_REPO \
  TRAIN.CHECKPOINT_FILE_PATH AVA_DATA_REPO/converged_model/SLOWFAST_8x8_Kinetics_R50.pkl \
  TRAIN.CHECKPOINT_TYPE caffe2 \
  SOLVER.BASE_LR 0.1 \
  SOLVER.MAX_EPOCH 20 \
  NUM_GPUS 4 \
  TEST.BATCH_SIZE 4 \
  TRAIN.BATCH_SIZE 24 \
  TRAIN.ENABLE True

I got 24.8 mAP after around 2.5 days.
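The trailing `KEY VALUE` pairs in the command above override entries of the YAML config by dotted path. The sketch below illustrates that idea on a plain nested dict. It is not PySlowFast's own config code: `apply_overrides` is a hypothetical helper, and the starting values in `cfg` are placeholders rather than the real defaults. Config libraries such as yacs provide a comparable `merge_from_list` method.

```python
def apply_overrides(cfg, opts):
    """Apply a flat [KEY, VALUE, KEY, VALUE, ...] list to a nested dict.

    Hypothetical helper for illustration only.
    """
    if len(opts) % 2 != 0:
        raise ValueError("overrides must come in KEY VALUE pairs")
    for key, value in zip(opts[0::2], opts[1::2]):
        node = cfg
        *parents, leaf = key.split(".")
        # Walk (or create) the intermediate sections, e.g. "TRAIN" or "SOLVER".
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = value
    return cfg


# Placeholder starting values, NOT the actual contents of the YAML file.
cfg = {"NUM_GPUS": 8, "TRAIN": {"BATCH_SIZE": 64}, "SOLVER": {"BASE_LR": 0.1}}

# A subset of the overrides from the command above.
opts = ["NUM_GPUS", 4, "TRAIN.BATCH_SIZE", 24, "SOLVER.MAX_EPOCH", 20]
apply_overrides(cfg, opts)

# With these overrides each GPU processes 24 / 4 = 6 clips per iteration.
clips_per_gpu = cfg["TRAIN"]["BATCH_SIZE"] // cfg["NUM_GPUS"]
print(cfg, clips_per_gpu)
```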

@hzhang57
Author

hzhang57 commented Mar 15, 2020

I want to train SlowFast on my own dataset. Can you tell me how to do it?

I haven't trained on a custom dataset. My recommendation is to convert your dataset to the AVA format so that you can reuse the existing training pipeline as-is. The other option is to write a custom data loader, which I think is more time-consuming.
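As a rough illustration of the first option, the sketch below writes custom box annotations as AVA-style CSV rows. It assumes the column layout `video_id, timestamp, x1, y1, x2, y2, action_id, person_id` with box coordinates normalized to [0, 1], which should be checked against the actual AVA v2.2 annotation files. The helper `to_ava_rows`, the input dictionary layout, and all sample values are hypothetical.

```python
import csv
import io


def to_ava_rows(video_id, timestamp_sec, boxes, width, height):
    """Convert pixel-space boxes into AVA-style rows (assumed column layout)."""
    rows = []
    for box in boxes:
        x1, y1, x2, y2 = box["xyxy"]
        rows.append([
            video_id,
            f"{timestamp_sec:04d}",               # zero-padded keyframe second
            f"{x1 / width:.3f}", f"{y1 / height:.3f}",
            f"{x2 / width:.3f}", f"{y2 / height:.3f}",
            box["action_id"],
            box["person_id"],
        ])
    return rows


# Hypothetical annotation for a 640x360 video at second 5.
boxes = [{"xyxy": (64, 36, 320, 180), "action_id": 12, "person_id": 0}]
rows = to_ava_rows("my_video_001", 5, boxes, width=640, height=360)

# Write to an in-memory buffer; swap in open(path, "w", newline="") for a file.
buffer = io.StringIO()
csv.writer(buffer).writerows(rows)
print(buffer.getvalue().strip())
# prints "my_video_001,0005,0.100,0.100,0.500,0.500,12,0"
```

The frame lists and any label-map files that the AVA loader expects would need the same kind of conversion; those are not covered here.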

@haooooooqi
Contributor

haooooooqi commented Mar 15, 2020

I think I did provide the recipe.
Maybe you missed it; you can find it here:

https://github.com/facebookresearch/SlowFast/blob/master/configs/AVA/SLOWFAST_32x2_R50_SHORT.yaml

@tonysy

tonysy commented Mar 15, 2020

@tonysy

tonysy commented Apr 1, 2020

The problem has been resolved. Details are in #141 and #112.
The training speed depends on the server hardware (CPU, memory, etc.). Because SlowFast-32x2 loads many images for each clip (8 for the slow pathway, 32 for the fast pathway), the I/O burden is heavy and the data loader's num_workers setting matters a lot.
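To make the I/O load concrete, here is a small sketch of the frame counts involved. It assumes that "32x2" means 32 frames sampled with a stride of 2, and that the 8 slow-pathway frames are a strided subset of the 32 fast-pathway frames (ratio 4, derived from the 8 and 32 quoted above). The constant and function names are illustrative, and the real loader may pick slightly different indices.

```python
NUM_FRAMES = 32      # frames per clip for the fast pathway
SAMPLING_RATE = 2    # temporal stride between sampled frames
ALPHA = 4            # fast/slow frame ratio: 32 / 8
BATCH_SIZE = 24      # batch size used earlier in this thread


def clip_indices(start):
    """Return (slow, fast) frame indices for one clip starting at `start`."""
    fast = [start + i * SAMPLING_RATE for i in range(NUM_FRAMES)]
    slow = fast[::ALPHA]  # simple strided subsampling (assumption)
    return slow, fast


slow, fast = clip_indices(0)
span = fast[-1] - fast[0] + 1           # raw frames covered by one clip
frames_per_batch = BATCH_SIZE * len(fast)

print(len(slow), len(fast), span, frames_per_batch)
# prints "8 32 63 768"
```

Under these assumptions every iteration reads and decodes 768 images, which is why the number of data loader workers and the disk and CPU throughput can dominate training time.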

@Gi-gigi

Gi-gigi commented Aug 6, 2021

The problem has been resolved. Details are in #141 and #112.
The training speed depends on the server hardware (CPU, memory, etc.). Because SlowFast-32x2 loads many images for each clip (8 for the slow pathway, 32 for the fast pathway), the I/O burden is heavy and the data loader's num_workers setting matters a lot.

Hi Mr. Song Yang, can you help me solve this problem? Please give me some advice: https://github.com//issues/469. I used the trained model to run testing only, and the problem described at that link occurred.
Thank you in advance.

4 participants