Mismatch between appearance_num_frames and feature-size #5
Can you post the error too?
File "{HOME}/STLT/src/modelling/models.py", line 267, in forward_features

Also, I noticed that the sampling range deducts 2 (e.g. data_utils.py line 77). Why is that? Finally (and sorry for being so pedantic), why the multiplication by sample_rate? Specifically, you have a comment on line 64 of data_utils.py that says 16 * 2, but the default appearance_num_frames = 32. Thanks
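For context, the 16 * 2 comment is consistent with strided temporal sampling: a clip of `num_frames` frames taken every `sample_rate` raw frames spans `num_frames * sample_rate` raw frames. This is a hedged sketch of that arithmetic, not the repo's actual sampler (`strided_clip_indices` is a hypothetical name):

```python
import numpy as np

def strided_clip_indices(start: int, num_frames: int, sample_rate: int) -> np.ndarray:
    """Indices of a clip sampled every `sample_rate` frames.

    The clip covers num_frames * sample_rate raw frames,
    e.g. 16 frames at stride 2 span 32 raw frames.
    """
    return start + np.arange(num_frames) * sample_rate

# 16 frames at stride 2 -> indices 0, 2, ..., 30 (a 32-frame span)
print(strided_clip_indices(0, 16, 2))
```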
Ok, I think I know what's going on. A preface first, some of the code ( In any case, the
Actually, it seems that sample_appearance_indices would in this case return only 25 frames (although the sampling is weird). For my code, I am just using a simpler sampler, which returns all frames if there are fewer than args.appearance_num_frames, and an ordered random choice otherwise. Incidentally, I am making some updates on my own personal branch. Let me know if you would be interested in them. Michael
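The simpler sampler described above could look something like this — a minimal sketch, not the code from the branch (`sample_frame_indices` is a hypothetical name):

```python
import numpy as np

def sample_frame_indices(num_video_frames: int, num_frames: int) -> np.ndarray:
    """Return all frame indices if the clip is short enough, otherwise
    an ordered (sorted) random subset of size num_frames."""
    if num_video_frames <= num_frames:
        # Fewer frames than requested: use every frame, in order.
        return np.arange(num_video_frames)
    # Random choice without replacement, then sort to preserve temporal order.
    return np.sort(np.random.choice(num_video_frames, size=num_frames, replace=False))
```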
Good luck with your research! Just beware: randomly sampling frames might cause issues. I can't point to a specific reference, but it is something you might want to double-check in case you're not getting the expected results.
Thanks for the heads up (that is why I found their code so weird, but it seems a lot of other systems do it; see e.g. the MMAction2 library). In my case, I am just using all the frames in the 1s clip anyway. Will keep you posted.
I'm closing this issue, feel free to reopen if the problem persists.
Hi Gorjan
I am running into an issue with setting the number of frames to sample from each video. To put it into context, I need to classify 1s clips at a time, which amount to 25 frames, and hence I cannot sample more than that. The current setup is 32 frames, and I changed appearance_num_frames to 25. However, this seems to interfere with the forward_features() method of TransformerResnet: the ResNet appears to output a sequence length of 32 by default, and I am not sure whether this can be modified.

The error happens in models.py, line 267, when it tries to add the position embedding. Any idea how I can rectify this? Am I interpreting appearance_num_frames correctly to begin with?
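This is not the repo's actual code, just a toy NumPy illustration of the failure mode, assuming the position embedding is sized for a fixed 32-frame sequence while the backbone now emits 25:

```python
import numpy as np

hidden = 512
pos_embed = np.zeros((1, 32, hidden))   # positional embedding sized for 32 frames
features = np.zeros((1, 25, hidden))    # backbone output for a 25-frame clip

try:
    features + pos_embed                # broadcasting fails: 25 vs 32
except ValueError as e:
    print("shape mismatch:", e)

# One possible workaround: slice the embedding to the actual sequence length.
out = features + pos_embed[:, : features.shape[1]]
```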