Vaguely inspired by DiffSinger's Shallow Diffusion mechanism.
Gaussian Diffusion has a tendency to become "lost" at times, producing undesirable results.
Being able to train (separately?) a lower-quality acoustic model and run it before diffusion could provide scaffolding that improves model stability, at the cost of more processing overhead.
In addition, I suggest letting users choose which acoustic model fills this role, since the choice may influence the final result and each option has its own pros and cons to weigh.
(For example, FFConvLSTM → GaussianDiffusion compared to BiLSTMResF0NonAttentiveDecoder → GaussianDiffusion.)
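
To make the request concrete, here is a rough sketch of the idea in plain Python. It is only an illustration under stated assumptions, not an existing implementation: every function name, the linear beta schedule, the start step `k`, and the stub models in `AUX_MODELS` are hypothetical. The auxiliary model produces a coarse prediction, that prediction is noised up to step `k`, and the reverse process starts from there instead of from pure noise.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an assumed linear beta schedule."""
    alpha_bars, prod = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / max(num_steps - 1, 1)
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def diffuse_to_step(coarse, k, alpha_bars, rng):
    """Noise the coarse prediction up to step k, treating it as x_0."""
    a = alpha_bars[k]
    return [
        math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
        for x in coarse
    ]


def shallow_diffusion(features, aux_model, denoise_step, k, alpha_bars, seed=0):
    """Start the reverse process at step k from a noised coarse prediction."""
    rng = random.Random(seed)
    coarse = aux_model(features)                      # scaffold from the cheaper model
    x = diffuse_to_step(coarse, k, alpha_bars, rng)
    for t in range(k, -1, -1):                        # k + 1 reverse steps
        x = denoise_step(x, t, features)
    return coarse, x


# Hypothetical stand-ins: real models would be trained networks.
def stub_model_a(features):
    return [0.9 * f for f in features]


def stub_model_b(features):
    return [f + 0.1 for f in features]


# User-selectable auxiliary models; the keys reuse the names from this issue,
# but the values are placeholders and not the actual model classes.
AUX_MODELS = {
    "FFConvLSTM": stub_model_a,
    "BiLSTMResF0NonAttentiveDecoder": stub_model_b,
}


def toy_denoise_step(x, t, features):
    """Placeholder for one learned reverse step: nudges x toward the features."""
    return [0.5 * (xi + fi) for xi, fi in zip(x, features)]


if __name__ == "__main__":
    alpha_bars = make_alpha_bars(100)
    coarse, refined = shallow_diffusion(
        [0.0, 1.0, 2.0], AUX_MODELS["FFConvLSTM"], toy_denoise_step,
        k=10, alpha_bars=alpha_bars,
    )
    print(coarse, refined)
```

In this sketch, swapping the auxiliary model is just a different key into `AUX_MODELS`, and a smaller `k` leans more heavily on the coarse prediction while also cutting the number of reverse steps.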