About crop_audio_window in color_syncnet_train.py: I'm not sure the conversion from 0-based to 1-based indexing is done correctly. For example, if the frame's file name is 0.jpg (the beginning of the video), the current implementation gives a non-zero start_idx for spec, which I think is wrong. It seems to me that for 0.jpg, the start_idx for spec should be 0.
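To make the concern concrete, here is a minimal sketch of the index computation, not the repository's actual code. The constants (80 mel frames per second, 25 fps video, a 16-step mel window) and the `offset` parameter are my assumptions for illustration; only the idea of a `+ 1` applied to the frame number comes from this thread.

```python
# Hypothetical constants, assumed for illustration only.
MEL_FRAMES_PER_SEC = 80.0  # assumed mel-spectrogram frames per second of audio
FPS = 25.0                 # assumed video frame rate
MEL_STEP_SIZE = 16         # assumed mel window length fed to the sync model


def spec_start_idx(frame_num, offset=0):
    """Map a video frame number (taken from the file name, e.g. 0.jpg -> 0)
    to the first mel-spectrogram row of its audio window.

    offset=1 mimics treating the 0-based file name as 1-based.
    """
    return int(MEL_FRAMES_PER_SEC * ((frame_num + offset) / FPS))


def crop_audio_window(spec, frame_num, offset=0):
    """Return the MEL_STEP_SIZE rows of spec that start at the frame's audio position."""
    start_idx = spec_start_idx(frame_num, offset)
    return spec[start_idx:start_idx + MEL_STEP_SIZE]


if __name__ == "__main__":
    # With the + 1, the very first frame skips the first 3 mel rows.
    print(spec_start_idx(0, offset=1))  # 3
    # Without it, 0.jpg maps to the start of the spectrogram.
    print(spec_start_idx(0, offset=0))  # 0
```

Under these assumptions the `+ 1` shifts every audio window by one video frame (about 3 mel rows, or 40 ms), so the audio paired with 0.jpg would actually belong to the next frame.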
The pretrained syncnet model was trained with the old code. Doesn't that mean the model was not trained on the best possible data, and so might not be as good as it could be? What do you think?
Yes, it can be improved a little bit, but we do not think the difference will be large. I say this because the code was originally written without that + 1, and performance was very similar. But your logic is definitely sound, thank you.