I used TensorBoard to inspect your model structure and found that the pb model you provided uses only one softmax with 256 outputs (8 bits).
However, the paper uses two separate DNNs to predict the coarse and fine parts of a sample. Is that because your model reuses the matrices O1 and O3 (O2 and O4), or do you only support 8 bits with mu-law compression?
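
For readers comparing the two output schemes mentioned above, here is a minimal sketch of each. It is not taken from the model in question: the function names are hypothetical, and it assumes the common reading in which 8-bit mu-law maps a sample in [-1, 1] to one of 256 classes, while the coarse/fine scheme splits an unsigned 16-bit sample into its high 8 bits (coarse) and low 8 bits (fine).

```python
import math

MU = 255  # assumption: 8-bit mu-law, i.e. 256 classes for a single softmax


def mu_law_encode(x, mu=MU):
    """Map a sample x in [-1, 1] to an integer class in [0, mu]."""
    x = max(-1.0, min(1.0, x))  # clip to the valid range
    compressed = math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)
    # Shift from [-1, 1] to [0, mu] and round to the nearest class.
    return int((compressed + 1.0) / 2.0 * mu + 0.5)


def mu_law_decode(index, mu=MU):
    """Inverse of mu_law_encode: class index back to a sample in [-1, 1]."""
    compressed = 2.0 * index / mu - 1.0
    return math.copysign(math.expm1(abs(compressed) * math.log1p(mu)) / mu, compressed)


def split_coarse_fine(sample16):
    """Split an unsigned 16-bit sample into (coarse, fine) 8-bit parts."""
    if not 0 <= sample16 <= 0xFFFF:
        raise ValueError("expected an unsigned 16-bit sample")
    return sample16 >> 8, sample16 & 0xFF  # high byte, low byte


def join_coarse_fine(coarse, fine):
    """Recombine the two 8-bit parts into the 16-bit sample."""
    return (coarse << 8) | fine
```

With a single 256-way softmax only the first pair of functions is needed; a dual-softmax model would predict `coarse` and `fine` separately and recombine them with `join_coarse_fine`.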
@npuichigo I have also rewritten the training code based on the graph. It works and the audio sounds good, but the waveform is not different from the target. Overall, the synthesised audio is delayed relative to the target.