Alignment with paper #11

Closed
tobyclh opened this issue Feb 3, 2019 · 2 comments

tobyclh commented Feb 3, 2019

Hello, thanks for releasing the PyTorch version of the code!
I have a couple of questions about syncing this repo with the paper (sorry for the pun):

  1. fc7 in the paper is a 256-d vector, whereas here the output feature is 1024-d (at least in the pretrained model, it seems). Is this a newer/better version of the work, or am I looking in the wrong place?
  2. In SyncNetInstance.py, line 107, a *4 is applied when sampling the audio. I suspect it is some sort of stride, but I can't find where the paper mentions it (perhaps it is too fundamental?). Would you explain what it is?
Owner

joonson commented Feb 4, 2019

Hi,

  1. This is an updated version, but the functionality should be the same.
  2. This is because the audio (spectrograms) is sampled at 100Hz, whereas the video is sampled at 25Hz.
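
To make point 2 concrete: with audio features at 100 Hz and video at 25 Hz, each video frame spans 100 / 25 = 4 audio frames, so video frame index i lines up with audio feature index i * 4. Below is a minimal Python sketch of that mapping. The helper name `audio_window` and the 5-frame video window are illustrative assumptions, not taken from the repository or from this thread.

```python
from typing import Tuple

VIDEO_FPS = 25    # video frame rate stated in the reply above
AUDIO_FPS = 100   # audio feature (spectrogram) rate stated in the reply above

# Number of audio feature frames covered by one video frame.
RATIO = AUDIO_FPS // VIDEO_FPS


def audio_window(vframe: int, video_window: int = 5) -> Tuple[int, int]:
    """Return the [start, stop) audio-frame range aligned with a video window.

    `vframe` is the index of the first video frame in the window.
    `video_window` (5 here) is a hypothetical window length for illustration.
    """
    start = vframe * RATIO                 # the "*4" step
    stop = start + video_window * RATIO    # same duration, counted in audio frames
    return start, stop


if __name__ == "__main__":
    for vframe in (0, 1, 10):
        print(vframe, audio_window(vframe))
```

Under these assumptions, a window starting at video frame 10 covers audio frames 40 up to (but not including) 60, which is the same 0.2 s of signal seen at the two sampling rates.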

Author

tobyclh commented Feb 9, 2019

Thank you for the response!

tobyclh closed this as completed Feb 9, 2019