Mismatch between appearance_num_frames and feature-size #5

Closed
michael-camilleri opened this issue Apr 5, 2022 · 7 comments

Comments

@michael-camilleri

Hi Gorjan,

I am running into an issue when setting the number of frames to sample from each video. For context, I need to classify 1-second clips at a time, which amount to 25 frames each, so I cannot sample more than that. The default setup uses 32 frames, and I changed appearance_num_frames to 25. However, this seems to interfere with the forward_features() method of TransformerResnet: the ResNet still outputs a sequence length of 32 by default, and I am not sure whether this can be modified.

The error happens in models.py line 267, when the features are added to the position embedding. Any idea how I can rectify this? Am I interpreting appearance_num_frames correctly to begin with?

@gorjanradevski
Owner

Can you post the error too?

@michael-camilleri
Author

File "{HOME}/STLT/src/modelling/models.py", line 267, in forward_features
features = features + self.pos_embed
RuntimeError: The size of tensor a (33) must match the size of tensor b (26) at non-singleton dimension 0
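
If it helps, my reading of those numbers (which may well be wrong) is that the backbone still emits 32 frame features while the position embedding now expects the 25 I configured, roughly:

```python
import torch

# Hypothetical illustration of the clash, not the repo's code: the backbone
# still produces 32 frame features (plus one extra token -> 33), while the
# position embedding was sized for appearance_num_frames=25 (plus one -> 26).
features = torch.zeros(33, 768)    # what forward_features receives
pos_embed = torch.zeros(26, 768)   # what the changed config builds
features = features + pos_embed    # RuntimeError at non-singleton dimension 0
```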

Also, I noticed that the sampling range subtracts 2 (e.g. data_utils.py line 77). Why is that?

Finally (and sorry for being so pedantic), why the multiplication by sample_rate? Specifically, you have a comment on line 64 of data_utils.py that says 16 * 2, but the default appearance_num_frames is 32.

Thanks

@gorjanradevski
Owner

gorjanradevski commented Apr 5, 2022

Ok, I think I know what's going on. A preface first: some of the code (sample_appearance_indices, sample_train_layout_indices, get_test_layout_indices) is taken from a paper we compare against (for a fair and accurate comparison). I checked that their code is correct, but I didn't do a deep dive, hence I can't answer your question -- it's best to copy-paste the index-sampling code into a Jupyter notebook and have a look at what exactly is going on.

In any case, when sample_appearance_indices is called, coord_nr_frames indicates how many frames you want to sample from the video (in train.py it is set via args.appearance_num_frames), and nr_video_frames is the number of frames in the video, in your case 25. Therefore, if you do frame_indices = sample_appearance_indices(32, 25, train=False), you'll get 32 frame indices sampled from your video, with some duplicates, because your video has fewer than 32 frames. How many frames to sample depends on your data; I would suggest you tune that parameter. In my case, I just took the value from the baseline paper.
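
If you want a quick feel for the behaviour without digging through the repo, a rough mental model is evenly spaced indices over the clip. This is only a simplified sketch, not the actual sample_appearance_indices implementation, and sample_evenly is a name I made up:

```python
import numpy as np

# Simplified sketch: spread coord_nr_frames indices evenly over the video,
# so asking for 32 indices from a 25-frame clip necessarily repeats frames.
def sample_evenly(coord_nr_frames: int, nr_video_frames: int) -> np.ndarray:
    return np.linspace(0, nr_video_frames - 1, num=coord_nr_frames).astype(int)

frame_indices = sample_evenly(32, 25)
print(len(frame_indices))        # 32
print(len(set(frame_indices)))   # fewer than 32, i.e. some duplicates
```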

@michael-camilleri
Author

Actually, it seems that sample_appearance_indices would in this case return only 25 frames (although the sampling is weird).

For my code, I am using a simpler sampler which returns all frames when there are fewer than args.appearance_num_frames, and an ordered random choice otherwise (see the sketch below).
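
Roughly along these lines (a simplified sketch of what I mean; the function name and signature are my own):

```python
import numpy as np

# Return every frame index when the clip is short enough, otherwise a sorted
# random subset of size appearance_num_frames (no duplicates).
def sample_frames(nr_video_frames: int, appearance_num_frames: int) -> np.ndarray:
    if nr_video_frames <= appearance_num_frames:
        return np.arange(nr_video_frames)
    chosen = np.random.choice(nr_video_frames, size=appearance_num_frames, replace=False)
    return np.sort(chosen)
```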

Incidentally, I am doing some updates on my own personal branch. Let me know if you would be interested in them.

Michael

@gorjanradevski
Owner

Good luck with your research! Just beware that randomly sampling frames might cause issues. I can't point to a specific reference, but it is something you might want to double-check in case you're not getting the expected results.

@michael-camilleri
Author

Thanks for the heads-up (that is why I found their code so weird, but it seems a lot of other systems do it; see e.g. the MMAction2 library).

In my case, I am just using all the frames in the 1s clip anyway.

Will keep you posted

@gorjanradevski
Owner

I'm closing this issue; feel free to reopen it if the problem persists.
