
GRU layer should have the batch_first=True flag #62

Open
ra1995 opened this issue Apr 20, 2023 · 3 comments


ra1995 commented Apr 20, 2023

Hi, I was going through the gesticulator codebase and using the GRU for speech feature encoding. I noticed that before sending the curr_speech input to the GRU, you keep the first dimension as the batch size and the second dimension as the temporal size. So, in my opinion, the batch_first=True flag should be used when initializing the GRU layer. Please let me know if this is the case. Thank you for sharing your awesome work :)
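For context, a minimal sketch of the shape mismatch being described (the tensor dimensions here are illustrative, not taken from the gesticulator codebase): by default, `nn.GRU` expects input shaped `(seq, batch, feature)`, so a batch-first `curr_speech` tensor would be silently misinterpreted unless `batch_first=True` is set.

```python
import torch
import torch.nn as nn

# Hypothetical dimensions for illustration only
batch_size, seq_len, feature_dim, hidden_dim = 4, 10, 26, 64

# curr_speech with the batch as the FIRST dimension, as in the issue
curr_speech = torch.randn(batch_size, seq_len, feature_dim)

# With batch_first=True, the GRU interprets the input as (batch, seq, feature)
gru = nn.GRU(input_size=feature_dim, hidden_size=hidden_dim, batch_first=True)

output, hidden = gru(curr_speech)
# output: (batch, seq, hidden) = (4, 10, 64)
# hidden: (num_layers, batch, hidden) = (1, 4, 64)

# Without batch_first=True, the same tensor would be treated as
# (seq=4, batch=10, feature=26) -- the batch and time axes swap, which
# does not raise an error but trains on scrambled sequences.
```

Note that the hidden state is always `(num_layers * num_directions, batch, hidden)` regardless of the `batch_first` setting; only the input and output tensors change layout.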

@Svito-zar (Owner)

Hi @ra1995. Thank you for raising your concern. It has been more than 3 years since I developed this model, so I don't remember exactly how I was doing things. But after a brief look at the code, I agree with you. It does seem that the batch size was the first dimension, which seems common to me. Since the code did not break, I assume that this was probably the default behavior in the PyTorch version used ... but I am not sure.
Does this cause an issue for you?

@ra1995 (Author)

ra1995 commented Apr 22, 2023

Yes, the model was not converging correctly on my custom dataset without the batch_first argument. After making the necessary changes, it's performing much better.

@Svito-zar (Owner)

Oh, that's very interesting! @ra1995, could you please make a PR with these changes? (I could do it myself, but if you make the pull request, you will get the credit for finding this.)
