Seeking information for CNN model used on MSRVTT #7

Closed
e0lithic opened this issue Aug 14, 2019 · 0 comments

Hi,

First of all, great work. I have managed to reproduce your results. However, could you please provide more information about the model you used for video feature extraction? I tried features extracted with the torchvision ResNet-152 model (pretrained weights), but they did not perform well with the trained Dual Encoding model you provided.

I assume that, since you trained your model on features from a particular ResNet model, the dual encoding is biased towards those features. To get good results with your trained model, the same CNN model would need to be used for feature extraction.
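One way to check for this kind of mismatch is to compare simple statistics of the two feature sets before feeding them to the trained model. The following is only a minimal sketch, not part of the Dual Encoding code: the function names (`summarize`, `compare`), the 25% tolerance, and the choice of statistics are all assumptions made for illustration, and features are assumed to be plain lists of floats, one vector per frame.

```python
import math


def l2_norm(vec):
    """Euclidean length of one feature vector."""
    return math.sqrt(sum(x * x for x in vec))


def summarize(features):
    """Basic statistics for a list of equal-length feature vectors."""
    if not features:
        raise ValueError("no feature vectors given")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("feature vectors have inconsistent lengths")
    norms = [l2_norm(f) for f in features]
    values = [x for f in features for x in f]
    return {
        "dim": dim,
        "mean_norm": sum(norms) / len(norms),
        "mean_value": sum(values) / len(values),
        "min_value": min(values),
    }


def compare(reference, candidate, rel_tol=0.25):
    """Return a list of human-readable differences (empty if none found).

    rel_tol is an arbitrary illustrative threshold, not a tuned value.
    """
    ref, cand = summarize(reference), summarize(candidate)
    issues = []
    if ref["dim"] != cand["dim"]:
        issues.append(
            "dimension mismatch: %d vs %d" % (ref["dim"], cand["dim"])
        )
    if not math.isclose(ref["mean_norm"], cand["mean_norm"], rel_tol=rel_tol):
        issues.append(
            "mean L2 norm differs: %.3f vs %.3f"
            % (ref["mean_norm"], cand["mean_norm"])
        )
    # Pooled features taken after a ReLU are non-negative, so a sign
    # difference hints at a different layer or an extra normalization step.
    if (ref["min_value"] < 0) != (cand["min_value"] < 0):
        issues.append("sign mismatch: only one set contains negative values")
    return issues
```

Here `reference` would be frame features loaded from the released feature files and `candidate` the ones extracted with torchvision. An empty result does not prove the two extractors match, but a dimension, scale, or sign difference would point to a different network variant, layer, or preprocessing.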

So could you please give more information about the particular variant of the ResNet-152 model you used?

Thanks
