training an autoencoder without z #66
Comments
We have trained autoencoder models without latents, but you may need to adjust some things. You also want to make sure that the encoder is not producing latents if the decoder is not using them.
Thanks for the reply! I only tried to change the decoder; for the encoder part I think it might have no influence on the result (even if it still produces a latent that is not used in the decoder, the gradient will not flow through this part, and it only affects the training speed)?
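To illustrate the point about gradient flow: in any autodiff framework, a branch whose output never feeds into the loss receives zero gradient, so an unused z head only wastes compute. A minimal stdlib-Python sketch (a toy linear model, not DDSP code; the finite difference stands in for backprop):

```python
# Toy illustration: a linear "encoder" produces two outputs, h (used by
# the decoder) and z (produced but never consumed). The loss gradient
# w.r.t. the z-head weight is exactly zero, so the unused branch costs
# compute but cannot affect the result.
def forward(x, w_h, w_z, w_dec):
    h = w_h * x      # feeds the decoder
    z = w_z * x      # computed, but the decoder ignores it
    return w_dec * h

def loss(x, target, w_h, w_z, w_dec):
    return (forward(x, w_h, w_z, w_dec) - target) ** 2

def grad_wz(x, target, w_h, w_z, w_dec, eps=1e-6):
    # Central finite difference w.r.t. the z-head weight.
    lo = loss(x, target, w_h, w_z - eps, w_dec)
    hi = loss(x, target, w_h, w_z + eps, w_dec)
    return (hi - lo) / (2 * eps)

g = grad_wz(x=1.5, target=2.0, w_h=0.7, w_z=0.3, w_dec=1.1)
print(abs(g) < 1e-9)  # True: no gradient reaches the unused z branch
```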
The loudness signals for most audio are actually very well behaved, so it should be fine to separately train a network on it (assuming you have enough data).
…On Sun, Mar 29, 2020 at 1:30 PM Chen Xupeng ***@***.***> wrote:
I also have a not-very-related question concerning loudness generation. I know DDSP generates loudness through some rules; what if we want to train a neural network to generate loudness? I found that an ordinary network might fail...
I agree that generating loudness from audio using neural networks should not be so hard. But what about trying to generate loudness information from some other kind of signal instead of audio?
Yup, that should be fine (my original suggestion was actually to model the loudness autoregressively), and in fact we have some research going on in a related direction currently.
Wow, sounds interesting! Looking forward to your next research!
You can also train small autoregressive models like a simple RNN, since the signal is much simpler than an audio waveform. I'm going to close this issue for now.
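As a rough sense of why a small autoregressive model suffices, here is a hypothetical stdlib-Python sketch (not DDSP code; the synthetic envelope and names are illustrative) that fits a tiny linear autoregressive predictor to a smooth loudness-like signal, predicting frame t from the previous k frames. A real setup would be a small RNN over real loudness features:

```python
import math

# Synthetic, well-behaved loudness envelope in dB-like units.
T, k, lr = 200, 3, 1e-4
loud = [-30.0 + 10.0 * math.sin(0.1 * t) for t in range(T)]

# Train an order-k linear autoregressive predictor with plain SGD.
w, b = [0.0] * k, 0.0
for epoch in range(200):
    for t in range(k, T):
        ctx = loud[t - k:t]                      # previous k frames
        err = b + sum(wi * ci for wi, ci in zip(w, ctx)) - loud[t]
        b -= lr * err
        w = [wi - lr * err * ci for wi, ci in zip(w, ctx)]

# Mean absolute one-step prediction error after training.
mae = sum(abs(b + sum(wi * ci for wi, ci in zip(w, loud[t - k:t])) - loud[t])
          for t in range(k, T)) / (T - k)
print("mean abs error:", round(mae, 3))  # small: the envelope is easy to predict
```

The point of the sketch is that a smooth, slowly varying envelope like loudness is nearly predictable from a short context window, which is why even a simple RNN (or here, a linear predictor) handles it, unlike a raw audio waveform.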
I noticed that in both the ddsp/ddsp/training/gin/models/ae.gin and ddsp/ddsp/training/gin/models/ae_abs.gin settings, the model uses z as the latent space. I tried to replace

Autoencoder.decoder = @decoders.ZRnnFcDecoder()

with

Autoencoder.decoder = @decoders.RnnFcDecoder()

so as not to use z and to test the model's performance. Is this the right way? I found that if I do not use z and use ae_abs.gin, which jointly learns an encoder for f0, I get a NaN loss after around 2000 steps. I suspect this issue comes from the missing z latent...
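A minimal sketch of the swap described above, assuming gin files structured like DDSP's model configs (exact binding names vary across DDSP versions, so treat these lines as illustrative rather than a verified config):

```gin
# Before: the decoder consumes the z latent.
# Autoencoder.decoder = @decoders.ZRnnFcDecoder()

# After: a decoder that does not take z. If the encoder only exists to
# produce z, disabling it as well avoids computing an unused latent.
Autoencoder.decoder = @decoders.RnnFcDecoder()
Autoencoder.encoder = None
```

Disabling the encoder matches the maintainer's earlier point that the encoder should not produce latents the decoder never uses.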