Reference Encoder Padding #7

Open
its-sandy opened this issue Jun 23, 2019 · 1 comment

Comments

@its-sandy

How do we ensure that the padding of the reference mel spectrogram is taken into account when the reference encoder is applied to a batch of mels?
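One possible starting point, as a rough sketch rather than this repository's actual code: track how many time steps of each item are still real (unpadded) after the convolution stack, so that later layers can ignore the rest. The layer count, kernel size, stride and padding below are assumptions modelled on a GST-style reference encoder (six strided convolutions), and the function names are made up for illustration.

```python
def conv_out_length(length, kernel=3, stride=2, padding=1):
    """Output length of one convolution along the time axis.

    Standard formula: floor((L + 2*padding - kernel) / stride) + 1.
    The default values are assumptions, not taken from the repository.
    """
    return (length + 2 * padding - kernel) // stride + 1


def valid_lengths_after_convs(lengths, num_layers=6):
    """Number of valid (unpadded) time steps per item after the conv stack."""
    out = []
    for n in lengths:
        for _ in range(num_layers):
            n = conv_out_length(n)
        out.append(n)
    return out


# Two reference mels with 100 and 37 frames, padded to 100 frames in a batch.
mel_lengths = [100, 37]
print(valid_lengths_after_convs(mel_lengths))  # [2, 1]
```

With these assumed settings the padded batch has 2 time steps after the convolutions, but only the first one belongs to the 37-frame item. The second step is derived purely from padding and is what the recurrent layer would need to skip.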

@hadaev8

hadaev8 commented Sep 23, 2019

Did you come to any conclusion?
I ran into this problem too. Because the GST encoder also sees the zero padding, the network can infer the duration of the audio. On my dataset this meant that short lines were pronounced slowly and long ones quickly.

I tried using one-dimensional convolutions and masking the zeros before the GRU layer, but that made the tokens work worse.
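For reference, a minimal sketch of what masking around the GRU could look like, assuming the per-item valid lengths after the convolutions are already known. Zeroing padded inputs alone does not freeze a recurrent state, because the biases still update it on every step, so an alternative is to read the state at the last valid step instead of the final one. The helper names and the nested-list layout are hypothetical; in PyTorch, `torch.nn.utils.rnn.pack_padded_sequence` serves a similar purpose.

```python
def build_mask(lengths, max_len):
    """1 for real time steps, 0 for padded ones, one row per batch item."""
    return [[1 if t < n else 0 for t in range(max_len)] for n in lengths]


def last_valid_state(states, lengths):
    """Pick the recurrent output at each item's last real time step.

    states[b][t] is the output vector for batch item b at step t
    (hypothetical layout used only for this illustration).
    """
    return [seq[max(n, 1) - 1] for seq, n in zip(states, lengths)]


# Toy batch: item 0 has 3 real steps, item 1 has 1 real step plus 2 padded.
states = [
    [[0.1], [0.2], [0.3]],
    [[1.0], [2.0], [3.0]],
]
lengths = [3, 1]

print(build_mask(lengths, 3))             # [[1, 1, 1], [1, 0, 0]]
print(last_valid_state(states, lengths))  # [[0.3], [1.0]]
```

Whether this helps the tokens is not something the sketch can show. It only removes the padded steps from the summary vector, so the reference embedding no longer depends on how much padding the batch happened to add.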
