Great work, Phil! In their paper, the authors applied this model to speech modeling. What would you advise changing in order to use it for speech? In speech the data are signals, so we have neither `num_tokens` nor `emb_dim`. Our input is simply `[batch, channel, time]`. Any advice?
jinglescode changed the title from "How to use this for speech?" to "How to use this for speech/audio generation?" on Dec 3, 2020