Some questions about ablation study #19

Closed
JerryX1110 opened this issue Jul 14, 2021 · 7 comments

Comments

@JerryX1110

I am curious about the performance under the following settings. Have you tried them?

  1. STCN without vs. with top-k filtering.
  2. STCN vs. STM with the same training strategy (it looks like STCN's pretraining may use more data and some extra augmentations).
@JerryX1110
Author

Also, for the ablation between "Every 5th + Last" and "Every 5th only": are both results predicted by the same model under different memory management strategies, or were two models trained separately?
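For readers unfamiliar with the two settings, here is a minimal sketch of which past frames would sit in memory under each strategy. The function name `memory_frames` and its parameters are hypothetical and not taken from the STCN code; it assumes frame 0 and every 5th frame are memorized, and that "+ Last" additionally keeps the immediately preceding frame as temporary memory.

```python
def memory_frames(t, every=5, include_last=False):
    """Indices of past frames held in memory when segmenting frame t (t >= 1).

    Assumption: frame 0 and every `every`-th frame are stored permanently;
    `include_last` additionally adds frame t - 1 as temporary memory.
    """
    if t < 1:
        raise ValueError("t must be at least 1")
    frames = [i for i in range(t) if i % every == 0]
    if include_last and (t - 1) not in frames:
        frames.append(t - 1)
    return frames


every_5th_only = memory_frames(13)
every_5th_plus_last = memory_frames(13, include_last=True)
```

Both variants only change what is read from memory at inference time, so in principle a single trained model can be evaluated under either one.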

@hkchengrex
Owner

hkchengrex commented Jul 14, 2021

  1. You can try this by setting top_k=None at inference; the same model can be used. As I remember, top-k gives a nice performance boost. You can also compare that result with MiVOS, which is essentially STM + top-k.
  2. STM has no official training code, so that is difficult to replicate. I used exactly the same data and augmentation as in my previous work, which has a very similar structure and performance to STM (if you disregard top-k). That should be evidence that the data strategy performs similarly to STM's.
  3. Same model.
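As a rough illustration of what top-k filtering does to a memory read, the sketch below applies a softmax over one query's affinity scores and optionally keeps only the k largest. It is plain Python with a hypothetical function name (`topk_softmax`), not the repository's implementation; passing `top_k=None` falls back to an ordinary softmax over all memory locations.

```python
import math


def topk_softmax(scores, top_k=None):
    """Softmax over `scores`; with top_k set, only the top_k largest scores
    receive weight and every other entry is set to zero."""
    n = len(scores)
    if top_k is None or top_k >= n:
        kept = list(range(n))
    else:
        # Indices of the top_k largest scores.
        kept = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    m = max(scores[i] for i in kept)  # subtract the max for numerical stability
    exps = {i: math.exp(scores[i] - m) for i in kept}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(n)]


scores = [2.0, 0.5, 1.0, -1.0]  # made-up affinities to four memory locations
full = topk_softmax(scores)              # analogous to top_k=None
filtered = topk_softmax(scores, top_k=2)
```

With filtering, the weight is renormalized over the kept entries, so low-affinity memory locations no longer contribute to the read-out.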

@JerryX1110
Author

Got it, thanks!

@JerryX1110
Author

I think "Every 5th + Last" may be worse than "Every 5th only" because you sample the frames of each video clip at an interval >= 5. Do you agree?

@JerryX1110
Author

I mean the sampling strategy in the training stage.

JerryX1110 reopened this Jul 15, 2021

@hkchengrex
Owner

No.

  1. The minimum distance between sampled frames is 1 -- you are probably mixing up the maximum distance and the minimum distance.
  2. The same sampling is used in our STM training, but "every 5th + last" is still better.
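The distinction between minimum and maximum distance can be sketched with a toy sampler. Everything here is an assumption for illustration (the name `sample_clip`, a clip length of 3, and `max_jump=5`); the thread does not show the actual training sampler. The point is that each gap between consecutive sampled frames is drawn from 1 up to a maximum, so adjacent frames can still be picked.

```python
import random


def sample_clip(video_len, clip_len=3, max_jump=5, rng=random):
    """Pick clip_len increasing frame indices whose consecutive gaps
    each lie in [1, max_jump]."""
    if video_len < clip_len:
        raise ValueError("video is shorter than the requested clip")
    # Redraw the gaps until the clip fits inside the video.
    while True:
        gaps = [rng.randint(1, max_jump) for _ in range(clip_len - 1)]
        span = sum(gaps)
        if span < video_len:
            break
    start = rng.randint(0, video_len - 1 - span)
    indices = [start]
    for gap in gaps:
        indices.append(indices[-1] + gap)
    return indices


rng = random.Random(0)
clips = [sample_clip(30, rng=rng) for _ in range(1000)]
```

Under this kind of sampling, a gap of exactly 1 is as likely as any other allowed gap, so the model does see neighbouring frames during training.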

@JerryX1110
Author

Thanks!
