How to start training from a pretrained model #18

Closed
jayanthante opened this issue May 10, 2021 · 7 comments

@jayanthante

Hi,

First of all, awesome job bringing transformers to video recognition!
I am comparatively new to the scene, so please bear with me if there are any errors.

My question is how to load the weights of a pretrained model and retrain it on our own dataset (unfreezing the learned weights if necessary).
Let's say we have loaded a model pretrained on Kinetics with 400 classes. I want to train it on my dataset, which has 10 classes.
So I create the CSV file and split it into train, val, and test.
Now I define the model:
model = TimeSformer(img_size=224, num_classes=10, num_frames=8, attention_type='divided_space_time', pretrained_model='/path/to/pretrained/model.pyth')

But this model doesn't work for my case, because it hasn't been trained on my dataset.
What do I do after this step?

@gberta
Contributor

gberta commented May 10, 2021

First, you need to create a "video_path label" formatted file (similar to the files used for Kinetics training), where every row contains the path to a distinct video and its action label.

Afterwards, to finetune from an existing PyTorch checkpoint, add the following options on the command line (you can also set them in the YAML config):

TRAIN.CHECKPOINT_FILE_PATH path_to_your_PyTorch_checkpoint
TRAIN.FINETUNE True

In your case, you should also set MODEL.NUM_CLASSES to 10. Then just follow the same instructions as provided for Kinetics training. Hope this helps!
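
As a rough illustration of the first step, here is a minimal sketch that writes "video_path label" files for the train, val, and test splits. The function name write_split_files, the split fractions, the output file names, and the example video paths are all hypothetical, and a single space is assumed as the separator between path and label; check the separator and file names your config expects before relying on it.

```python
import os
import random
import tempfile


def write_split_files(samples, out_dir, val_frac=0.1, test_frac=0.1, seed=0, sep=" "):
    """Write train.csv, val.csv and test.csv, one 'video_path<sep>label' row per video.

    `samples` is a list of (video_path, integer_label) pairs.
    Returns the number of rows written to each split.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a given seed

    n_val = int(len(shuffled) * val_frac)
    n_test = int(len(shuffled) * test_frac)
    splits = {
        "val": shuffled[:n_val],
        "test": shuffled[n_val:n_val + n_test],
        "train": shuffled[n_val + n_test:],
    }

    os.makedirs(out_dir, exist_ok=True)
    for name, rows in splits.items():
        with open(os.path.join(out_dir, f"{name}.csv"), "w") as f:
            for path, label in rows:
                f.write(f"{path}{sep}{label}\n")
    return {name: len(rows) for name, rows in splits.items()}


# Hypothetical dataset: 50 videos spread over 10 classes (labels 0..9).
samples = [(f"/data/videos/clip_{i:03d}.mp4", i % 10) for i in range(50)]
out_dir = tempfile.mkdtemp()
counts = write_split_files(samples, out_dir)
print(counts)
```

Labels are written as integers in the range 0 to 9 here, which matches MODEL.NUM_CLASSES set to 10.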

@jayanthante
Author

Yes, thanks! One more question:

I wanted to increase the number of clips sampled from a single video, so I went to timesformer/datasets/kinetics.py
and changed self._num_clips to the number of samples I want.
Is it possible to set this in the YAML configuration file?
Are there any results on how this sampling influences the accuracy of the model?

@gberta
Contributor

gberta commented May 10, 2021

If you want to make it part of the YAML configuration file, you would need to go to timesformer/config/default.py and add a separate entry TRAIN.NUM_CLIPS. Then in timesformer/datasets/kinetics.py you could just set self._num_clips = cfg.TRAIN.NUM_CLIPS. I actually haven't tried increasing the number of clips used during training. If you observe improved performance, please let me know.

@jayanthante
Author

jayanthante commented May 10, 2021

I added
_C.TRAIN.NUM_CLIPS = 1
to timesformer/config/default.py

Then I set self._num_clips = cfg.TRAIN.NUM_CLIPS in timesformer/datasets/kinetics.py

It gives me this error:
    self.merge_from_other_cfg(loaded_cfg)
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/fvcore/common/config.py", line 123, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: TRAIN.NUM_CLIPS'

@gberta
Contributor

gberta commented May 10, 2021

It should be _C.DATA.TRAIN_NUM_CLIPS = 1. There's no field _C.TRAIN.

@jayanthante
Author

Still doesn't work

    self.merge_from_other_cfg(loaded_cfg)
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/fvcore/common/config.py", line 123, in merge_from_other_cfg
    return super().merge_from_other_cfg(cfg_other)
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 217, in merge_from_other_cfg
    _merge_a_into_b(cfg_other, self, self, [])
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 478, in _merge_a_into_b
    _merge_a_into_b(v, b[k], root, key_list + [k])
  File "/home/ubuntu/anaconda3/envs/timesformer/lib/python3.7/site-packages/yacs/config.py", line 491, in _merge_a_into_b
    raise KeyError("Non-existent config key: {}".format(full_key))
KeyError: 'Non-existent config key: DATA.TRAIN_NUM_CLIPS'
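
For context, the KeyError in the traceback above is raised during the merge when the loaded YAML contains a key that the defaults seen by the running process do not declare. The sketch below is a simplified stand-in written with plain dictionaries, not the actual yacs code; the function name merge_into_defaults and the example values are hypothetical. It only illustrates the rule that a key must exist in the defaults before a YAML file may override it.

```python
def merge_into_defaults(loaded, defaults, key_path=()):
    """Recursively copy values from `loaded` into `defaults`.

    Mimics the strict behaviour seen in the traceback: any key that is
    not already present in `defaults` is rejected with a KeyError.
    """
    for key, value in loaded.items():
        full_key = ".".join(key_path + (key,))
        if key not in defaults:
            raise KeyError("Non-existent config key: {}".format(full_key))
        if isinstance(value, dict) and isinstance(defaults[key], dict):
            merge_into_defaults(value, defaults[key], key_path + (key,))
        else:
            defaults[key] = value


# Hypothetical defaults that do NOT declare DATA.TRAIN_NUM_CLIPS yet.
defaults = {"DATA": {"NUM_FRAMES": 8}, "MODEL": {"NUM_CLASSES": 400}}
# Hypothetical overrides, as they might be loaded from a YAML file.
yaml_overrides = {"DATA": {"TRAIN_NUM_CLIPS": 4}, "MODEL": {"NUM_CLASSES": 10}}

try:
    merge_into_defaults(yaml_overrides, defaults)
except KeyError as err:
    error_message = err.args[0]
    print(error_message)

# Once the key is declared in the defaults, the same merge succeeds.
defaults["DATA"]["TRAIN_NUM_CLIPS"] = 1
merge_into_defaults(yaml_overrides, defaults)
print(defaults)
```

If the error persists after the key has been added to the defaults file, the process is probably reading a set of defaults that does not include the edit.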

@gberta
Contributor

gberta commented May 10, 2021

To be honest, I'm not sure what the issue might be here. I just tried this on my local codebase, and it worked fine.
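
One possible cause, which is not confirmed anywhere in this thread, is that Python imports a different copy of the package than the one that was edited (for example an installed copy in site-packages rather than the local checkout). The sketch below is a generic check under that assumption; the helper name locate_module is hypothetical, and the module name to pass in your environment is whichever module defines the _C defaults.

```python
import importlib.util


def locate_module(module_name):
    """Return the file path Python would load for `module_name`, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except ModuleNotFoundError:
        # Raised when a parent package of a dotted name is missing.
        return None
    return None if spec is None else spec.origin


# Demonstration with a standard-library module.
print(locate_module("json"))

# In the training environment, pass the module that defines the _C defaults
# and compare the printed path with the file that was edited.
```

If the printed path points somewhere other than the edited file, reinstalling the package in development mode or editing the copy that is actually imported would be the next thing to try.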
